From mboxrd@z Thu Jan  1 00:00:00 1970
From: Chris Marusich <cmmarusich@gmail.com>
Subject: Re: Overhauling the cargo-build-system
Date: Sun, 08 Dec 2019 20:45:07 -0800
Message-ID: <87h82ap0oc.fsf_-_@gmail.com>
References: <20191010155056.GD1301@E5400> <87d0f4p6xd.fsf@gnu.org>
 <20191011141342.GC13364@E5400>
 <e2b9941a46d4d552652732d6d5f14be0@riseup.net>
 <20191010155056.GD1301@E5400> <87d0f4p6xd.fsf@gnu.org>
 <20191011141342.GC13364@E5400>
 <e2b9941a46d4d552652732d6d5f14be0@riseup.net> <87imnjtsk3.fsf@gnu.org>
 <20191117071934.GC12423@E5400> <20191010155056.GD1301@E5400>
 <87d0f4p6xd.fsf@gnu.org> <20191011141342.GC13364@E5400>
 <e2b9941a46d4d552652732d6d5f14be0@riseup.net> <87imnjtsk3.fsf@gnu.org>
 <87imnjtsk3.fsf@gnu.org>
Mime-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-=";
 micalg=pgp-sha256; protocol="application/pgp-signature"
Return-path: <guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org>
Received: from eggs.gnu.org ([2001:470:142:3::10]:37350)
 by lists.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <cmmarusich@gmail.com>) id 1ieAv7-0000vR-3T
 for guix-devel@gnu.org; Sun, 08 Dec 2019 23:45:19 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
 (envelope-from <cmmarusich@gmail.com>) id 1ieAv5-0003v0-1b
 for guix-devel@gnu.org; Sun, 08 Dec 2019 23:45:16 -0500
In-Reply-To: <87imnjtsk3.fsf@gnu.org> ("Ludovic
 \=\?utf-8\?Q\?Court\=C3\=A8s\=22'\?\=
 \=\?utf-8\?Q\?s\?\= message of "Sat, 16
 Nov 2019 22:33:32 +0100, Sun, 17 Nov 2019 22:22:21 +0100, Sat, 16 Nov 2019
 18:35:46 -0800")
List-Id: "Development of GNU Guix and the GNU System distribution."
 <guix-devel.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/guix-devel>,
 <mailto:guix-devel-request@gnu.org?subject=unsubscribe>
List-Archive: <https://lists.gnu.org/archive/html/guix-devel>
List-Post: <mailto:guix-devel@gnu.org>
List-Help: <mailto:guix-devel-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/guix-devel>,
 <mailto:guix-devel-request@gnu.org?subject=subscribe>
Errors-To: guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org
Sender: "Guix-devel" <guix-devel-bounces+gcggd-guix-devel=m.gmane.org@gnu.org>
To: Ludovic =?utf-8?Q?Court=C3=A8s?= <ludo@gnu.org>
Cc: guix-devel@gnu.org

--=-=-=
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: quoted-printable

Hi,

Ludovic Courtès <ludo@gnu.org> writes:

> What I would have liked is to somehow replace the #:cargo-inputs
> argument (which is build-system-specific and thus “opaque”) with regular
> ‘native-inputs’ or ‘inputs’ field.

That would be nice.  However, it doesn't seem possible to express
Cargo's "dependencies" and "dev-dependencies" concepts using Guix's
current package DSL.

Consider the proc-macro2 and quote crates.  We added these two crates in
commit 2444abd9c124cc55f8f19a0462e06a2094f25a9d, in the same patch
series where we added #:cargo-inputs and #:cargo-development-inputs:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=35318

Here is the Cargo.toml file for proc-macro2:

  https://github.com/alexcrichton/proc-macro2/blob/master/Cargo.toml

  [dev-dependencies]
  quote = { version = "1.0", default_features = false }

And here is the Cargo.toml file for quote:

  https://github.com/dtolnay/quote/blob/master/Cargo.toml

  [dependencies]
  proc-macro2 = { version = "1.0", default-features = false }

Here is a diagram of their dependency relationship:

  +---------------+
  |     quote     | <+
  +---------------+  |
    |                |
    | dependencies   | dev-dependencies
    v                |
  +---------------+  |
  |  proc-macro2  | -+
  +---------------+

To Cargo, this cycle is not a problem, since "dev-dependencies" are
treated differently from "dependencies":

  https://doc.rust-lang.org/cargo/reference/specifying-dependencies.html

  "Dev-dependencies are not used when compiling a package for building,
  but are used for compiling tests, examples, and benchmarks.

  These dependencies are not propagated to other packages which depend
  on this package."

proc-macro2 declares a "dev-dependency" on quote because it uses quote
in its doc tests:

  https://github.com/alexcrichton/proc-macro2/blob/e82e8571460a0a0e00f52f011a74a5e0359acf3e/src/lib.rs#L785

This relationship between proc-macro2 and quote cannot be readily
expressed using the current package DSL in Guix.  If you try to model
"dependencies" and "dev-dependencies" as "inputs" (or "native-inputs",
or some combination of the two), Guix will fail due to the cycle.
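
To make the cycle concrete, here is a minimal Python sketch.  It is not
Guix or Cargo code.  The CRATES table and the find_cycle helper are
hypothetical names made up for illustration, and the table contains only
the edges mentioned in this message.  It shows that following
"dependencies" edges alone gives an acyclic graph, while following
"dev-dependencies" as well produces the cycle:

```python
# Simplified, hypothetical model of the crates discussed above.
# Only the edges mentioned in this message are listed.
CRATES = {
    "proc-macro2": {"dependencies": ["unicode-xid"],
                    "dev-dependencies": ["quote"]},
    "quote": {"dependencies": ["proc-macro2"]},
    "unicode-xid": {},
}


def find_cycle(crates, kinds):
    """Return one cycle as a list of crate names, or None.

    Only edges whose kind is listed in `kinds` are followed.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, done
    state = {name: WHITE for name in crates}
    path = []

    def visit(name):
        state[name] = GREY
        path.append(name)
        for kind in kinds:
            for dep in crates[name].get(kind, []):
                if state[dep] == GREY:
                    # dep is already on the current path: a cycle.
                    return path[path.index(dep):] + [dep]
                if state[dep] == WHITE:
                    cycle = visit(dep)
                    if cycle:
                        return cycle
        path.pop()
        state[name] = BLACK
        return None

    for name in crates:
        if state[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


print(find_cycle(CRATES, ["dependencies"]))
print(find_cycle(CRATES, ["dependencies", "dev-dependencies"]))
```

The first call prints None.  The second prints the cycle
['proc-macro2', 'quote', 'proc-macro2'], which is the kind of loop a
package graph cannot contain if both kinds of edges become inputs.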

Presumably, proc-macro2 just needs the source of quote (and the source
of proc-macro2's other dependency, unicode-xid).  When Cargo builds
proc-macro2, it will take care of building quote and making it available
during proc-macro2's tests.  Guix "just" needs to provide proc-macro2
with the quote source.  You might think this poses a bootstrapping
problem for Cargo, but I guess it doesn't.  As long as Cargo has the
source for proc-macro2, quote, and unicode-xid, I guess it can build
proc-macro2 and quote in any order.

Unless we missed something in our discussion of patch 35318, there is no
easy way to express the relationship between proc-macro2 and quote
without changing (or mis-using) the existing package DSL.  In the same
way that the package DSL introduced "native-inputs" and "inputs" as
concepts to facilitate cross-compilation, one way to solve this problem
might be to introduce a new concept to the package DSL that would make
it possible for Guix to express this kind of relationship correctly.

However, in the discussion of patch 35318, everyone (myself included)
seemed opposed to changing the package DSL if we didn't have to.  For
example, in response to an earlier version of the patch series in which
we tried to map "dependencies" and "dev-dependencies" onto the "inputs"
and "native-inputs" concepts (which was probably an abuse of the package
DSL, since "native-inputs" is a cross-compilation concept), you said: "I
don't understand yet why you change the role of 'inputs' compared to how
it is in the rest of Guix."  Ultimately, we decided not to modify the
package DSL or the meaning of "inputs".  Instead, we decided to encode
the necessary information about dependencies in the cargo-build-system's
arguments.  That is how we arrived at #:cargo-inputs and
#:cargo-development-inputs.

By introducing #:cargo-inputs and #:cargo-development-inputs as package
arguments to the cargo-build-system, we were able to solve the cyclic
dependency problem in one specific way.  Perhaps there are better ways.
I agree it would be nice if it were integrated into the package DSL.  I
think that changing the package DSL to suit our needs might work, but
I'm not sure how to change it without making it too Cargo-specific.

> I know it’s not that easy with Rust and Cargo, I just never manage to
> fully grasp why :-), but at least that should be our horizon IMO.

Well, you're not alone!  I'm not (yet!) an expert in Rust, but I find
these problems difficult to understand, too.  Cyclic dependencies are
just one issue.  There are other problems, too.

One problem that Efraim has mentioned is that every crate wants all of
its sources to be available at build time.  It's as if each crate wants
all of its source crates (and all of their source crates, transitively)
to be propagated into the build environment, even if not all of them are
used.

Another problem, which Efraim has also mentioned, is that it seems hard
to "cache" the result of building a crate.  So every crate wants to
build its transitive closure of dependencies from scratch, every time.
In a traditional GNU/Linux system, I guess cargo must cache the results
of these builds somehow, but I don't know how.  In the case of C
libraries, we can just produce .so files that other builds can receive
as inputs, and all is well...but in Rust, it seems hard to do something
like that.  I have to admit, I don't know a lot about this, but based on
what I've heard, it sounds like we would basically have to re-implement
a lot of what Cargo is already doing, in order to get the behavior we
want.  Maybe that's the right path; I don't know.

Ludovic Courtès <ludo@gnu.org> writes:

>> I suppose one way to work around some of the issues is to make it so
>> that the crates "build" by copying the source to %out/share/guix-vendor
>> or something.
>
> So the core issue is that there’s nothing like shared libraries, is that
> correct?  This, in turn, means that there’s nothing to actually build,
> and thus a crate doesn’t really map to a package in the usual sense of
> the word, right?
>
> In that case, what you suggest (copying the source in the package
> output) sounds like it could work.  It would be an improvement over what
> we have now: the package graph would correspond to the crate graph.

Yes, you could install the source into the "out" output, or to a
separate output such as "src".  You could also define a bunch of
"proc-macro2-src" and "quote-src" packages that only build the source.
These possibilities sound similar to the plan we were originally
considering.  We discussed that plan here:

  https://debbugs.gnu.org/cgi/bugreport.cgi?bug=35155

In short, we hoped to build a crate's source and install it (the source)
into a specific output.  Then for any given crate we want to build, we
would add other crates to its inputs, and the cargo-build-system would
take care of populating the vendor directory with the transitive closure
of source found in the inputs.  However, we couldn't implement that plan
because Guix's current DSL doesn't work with the cycles introduced by
Cargo's "dependencies" and "dev-dependencies".

Perhaps if we defined all crates as "proc-macro2-src" and "quote-src"
etc., we could then define a crate that builds an actual artifact (e.g.,
ripgrep) by dropping the "src" suffix.  So for ripgrep, we'd have
"ripgrep-src" and also "ripgrep".  The former would just copy the source
into the output, and the latter ("ripgrep") would list the former as an
input in order to actually build the program.  In this model, I guess
every "*-src" package would have no inputs and just one output.  I guess
any package that produces an artifact, for example the "ripgrep"
package, would list a bunch of "*-src" packages as inputs: one for every
crate in the transitive closure of dependencies and dev-dependencies of
the ripgrep crate.  That might solve the problem of cyclic dependencies,
and it might reduce (but not eliminate) the amount of excessive building
performed by cargo-build-system.  However, it would make some package
definitions large, it would introduce duplication of inputs across
packages that need the same "*-src" inputs, and it would create a lot of
"*-src" packages.  On the plus side, tools like "guix graph" would work
as-is; currently, "guix graph" has not been taught to understand
#:cargo-inputs and #:cargo-development-inputs for cargo-build-system
packages.
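
As a rough illustration of the "*-src" model, the following Python
sketch computes the list of "*-src" inputs an artifact package would
need.  It is not cargo-build-system code.  The src_inputs helper and the
"app" crate are hypothetical, and the crate table is a simplified model
that contains only the edges mentioned earlier in this message.  Because
each "*-src" package has no inputs of its own, the traversal only has to
collect names, so the proc-macro2/quote cycle is harmless:

```python
# Simplified, hypothetical crate table.  "app" stands in for a crate
# that produces an actual artifact; it is not a real crate.
CRATES = {
    "app": {"dependencies": ["quote"]},
    "proc-macro2": {"dependencies": ["unicode-xid"],
                    "dev-dependencies": ["quote"]},
    "quote": {"dependencies": ["proc-macro2"]},
    "unicode-xid": {},
}


def src_inputs(crates, root):
    """List the "*-src" inputs needed to build `root`.

    Walks the transitive closure of dependencies and dev-dependencies,
    including `root` itself.  The `seen` set makes the walk terminate
    even when the crate graph contains cycles.
    """
    seen = set()
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.add(name)
        for kind in ("dependencies", "dev-dependencies"):
            stack.extend(crates[name].get(kind, []))
    return sorted(f"{name}-src" for name in seen)


print(src_inputs(CRATES, "app"))
```

For this table the result is app-src, proc-macro2-src, quote-src and
unicode-xid-src.  The same list would have to be repeated in every
package whose closure overlaps with it, which is the duplication
mentioned above.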

Maybe that really is a better way.  If I'm not mistaken, this seems to
be the direction Efraim has begun to take things with commit
86e443c71d4d19e6f80cad9ca15b9c3a301c738c.  Thank you for taking
initiative to try to improve things, Efraim!  I'm in favor of
experimenting with it to see if it works out.  If it proves to be useful
and we want to stick with it, then we should consider removing the logic
from cargo-build-system that implemented the #:cargo-inputs and
#:cargo-development-inputs arguments.

In the end, I just want to be able to use Rust on Guix!

-- 
Chris

--=-=-=
Content-Type: application/pgp-signature; name="signature.asc"

-----BEGIN PGP SIGNATURE-----

iQIzBAEBCAAdFiEEy/WXVcvn5+/vGD+x3UCaFdgiRp0FAl3t0VMACgkQ3UCaFdgi
Rp3LoxAAw/gzbV7PMA5hAn9TXP9ByH8tDrQG5cHQhbYcPdgg6j9tbTALMjLxsw/4
h/5EuQXtdcQ9WpZUYkkp8MrHsYNnVJZ5GV3dQxscROgCX/hONblS0072OKHNbqzh
dHiZiCS8geCWcZXYRL+RRzvqpvOc3tKN+7hsAxIU7tZlXgHRuqWqVObLl+AXO9nl
FhEU4U+iS1xJsq46bnGCGNdrp4PD0hHX2ExlevgIpAqGumIKSVJt3xC/F1VC0ymD
3YDpcw+/h0VxSZX32+Yv+1IahUfhILViWG0AIQ9mmK4QKeDQHmYv5QU3IjuXS4Xh
x8V8uRPb/pRnmtlJuUCuukSs1GNgV06idKPXnVbCCSEK3Yiz1TWPnK2F5YtKkoHy
wXpOtHB6lpW0BBMFlKIamn8K+a2OCusSlFA7FyYif3bv3o75bmsf+EHqyfrxQVQ4
DupVe/9uQxkYP/RbHvQ82sKk5uC/RepjU7SqczR8bRUSsVtc4cgn8AfyAnA7O3kc
VGesWxtFNkqBp5YHs8nucx3tOypo7a/DfWVpOJVDAvV3oNFgrk4pfzaFNZqdzVwj
2vN9DhLsgk4XbYYa4mfv8u/s/YR6wDov9npkV3JxaWm2siS0NgWttzRiXkCIUrUX
yjoXXQSvcOpvZ8Dw950jPH9MBCju9QnFbNyVlhSIVm6u/ytV9sM=
=xVvL
-----END PGP SIGNATURE-----
--=-=-=--