From: Efraim Flashner
To: guix-devel@gnu.org
Subject: [BLOG] rust blog post
Date: Mon, 25 Nov 2019 12:30:37 +0200
Message-ID: <20191125103037.GJ1124@E5400>

There's one or two FIXMEs and the like in the blog post but I figured I'd send it off anyway.

-- 
Efraim Flashner אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D 14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

Attachment: rust-in-guix

It's easy to think of Rust as a new programming language but it has already been around for five years. Rust has made it past its 1.0 release and the compiler is written in Rust. We even have mrustc to act as a secondary method to bootstrap new Rust releases without falling back to downloading precompiled tarballs.
So how is the state of Rust in Guix today?

Truthfully, Rust in Guix could be better. The developer story for Rust is pretty straightforward: write your program, declare your dependencies in a Cargo.toml file, and ```cargo foo``` will figure out your dependency chain. ```cargo build``` will download any missing dependencies, even using a cache directory to reduce downloads, and compile the bits of the dependencies that are needed. But what about distro maintainers?

Obviously we can't download dependencies at build time; they need to be packaged ahead of time. So we package those dependencies. But wait, those dependencies have dependencies that are needed, and those ones too. It's dependencies all the way down, hidden in 5 years of iterative development that we're late to the party to, trying to capture snapshots in time where specific versions of libraries were built using previous generations. All this all the way back to the beginning, whenever that is.

Obviously humans are prone to errors, so to work around this while packaging Rust crates Guix effectively has two importers for crates: one that will import a specific version and list its dependencies, and one that can take a crate and recursively import all the packages that it depends on. Currently some work is needed to allow the recursive importer to interpret version numbers, but for now it works quite well.

Taking a break from Rust for a moment, let's look at some of the other languages that are packaged. Packages written in C/C++, processed with autotools or cmake or meson, are the easiest. Dependencies are declared, source code is provided, and there's a clear distinction between source code and compiled binary; source code is for hacking on, binaries are for executing. The closest to a middle ground are libraries, which allow programs to use features from other programs. In order to use a package, all of its dependencies must be packaged and the libraries linked.
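Going back to the recursive importer for a moment, the idea of following dependencies until nothing new turns up can be sketched in a few lines of Rust. This is not the Guix importer itself; the in-memory index and the `transitive_deps` name are assumptions made purely for illustration, and version numbers are ignored entirely.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Collect every crate reachable from `root`, not counting `root` itself.
/// `index` is a hypothetical map from crate name to its direct dependencies.
fn transitive_deps<'a>(
    index: &BTreeMap<&'a str, Vec<&'a str>>,
    root: &'a str,
) -> BTreeSet<&'a str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![root];
    while let Some(name) = stack.pop() {
        // Crates missing from the index are treated as having no dependencies.
        if let Some(deps) = index.get(name) {
            for &dep in deps {
                // Only visit each crate once, so cycles terminate.
                if seen.insert(dep) {
                    stack.push(dep);
                }
            }
        }
    }
    // A dependency cycle could lead back to the root; it is not its own dependency.
    seen.remove(root);
    seen
}
```

For an index where foo depends on bar and bar depends on baz, `transitive_deps(&index, "foo")` yields bar and baz: one declared dependency, two packages to write.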
Taking a look at the other end we have JavaScript. JavaScript is source code; it's ready to be read and hacked on. JavaScript is already ready to be run, therefore it must be a binary. It's... both? JavaScript libraries leave distro maintainers in a difficult position. Building JavaScript ends up in the same problem as we saw with Rust: recursive dependencies all the way down, iterative versions depending on previous ones, and a misty past from whence everything sprang forth, which must be recreated in order to bring us back to the present day. But there's more difficulty: often, even after a 'build' phase has been run and tests have been run on JavaScript, we're left with unchanged code. Except now it's no longer source, it's a binary... or something. So just what did we build and test?

We can worry about JavaScript another time; Rust has a clear boundary between source code and binaries.

So how about Python? Python is a scripting language and can be run without being compiled, but it can also be compiled (pre-interpreted?) to bytecode and installed either locally or globally. That leaves us with source code which can double as a binary, and bytecode which is clearly a binary. Given these two states, we declare the uncompiled version as source code, ignore that it can be run as a script except when testing the code, and we never return to second-guess ourselves.

How about Go? Go is another language that defies packaging efforts, primarily because build instructions often make use of the HEAD of branches in other git repositories, not tagged and released versions. That the names of the libraries are long and cumbersome is mostly a secondary issue. On the developer side a binary is a ```go build``` away. Go will download missing source and compile libraries as needed. On the packager side the libraries are carefully gathered one by one, precompiled, and placed carefully in a directory hierarchy for use in future builds.
What could be a long build of a program is replaced by an intermediate series of packages where libraries are pre-compiled, and at each stage only the new code has to be compiled.

For everyone except the distro maintainer, the similarities between Rust and Go are strong. In both cases dependencies are downloaded as part of the build process, there's a cache for the downloaded sources and the compiled libraries, and build artifacts can be reused between different programs with overlapping dependencies. For the distro maintainer many of these similarities are thrown out. Dependencies are packaged ahead of time and the previously packaged libraries are literally a cache. Libraries can be reused for other packages, yes, but for Rust they're not.

Why not? If they're already compiled why not reuse them?

Previously we've discussed source code and compiled binaries (or libraries), but in Rust there are two types of libraries. There are dynamic libraries, packaged as ```libfoo.so```, and there are Rust libraries, packaged as ```libfoo.rlib``` or ```libfoo-MAGICHASH.rlib```. When a Rust package declares a dependency on a Rust library, it doesn't declare a dependency on the whole library but rather just on the parts that it needs. This means that we can get away with packaging only a portion of the library being depended on, or the library with only some of its features or its own dependencies. When compiling a final binary, a Rust binary doesn't link to an rlib; it takes just the part that it needs and incorporates it into the binary. As far as package maintainers are concerned, this isn't ideal but it is something we can live with; we already have this case with static libraries from other languages. If we were to compile the binary manually the command would be something like ```rustc --crate-type bin foo.rs --extern bar=/path/to/libbar.rlib``` and we'd continue on.
However, when bar depends on baz, the similar command, ```rustc --crate-type rlib bar.rs --extern baz=/path/to/libbaz.rlib```, _doesn't_ link libbaz to libbar. This leaves us in a pickle; we know which libraries we need but we're unable to compile them individually and build them up iteratively until we reach the binary end goal.

One of our packaged Rust programs, rust-cbindgen, is used by IceCat. Rust-cbindgen declares 8 (TODO: check this number) dependencies. When run outside of the build environment ```cargo build``` downloads a total of 58 (TODO: check this number) packages, compiles them and produces a binary. Our recursive importer created more than 300 new packages before it was told to stop.

Returning to our build process for Rust libraries, since we couldn't link one rlib to another rlib, we opted to compile one rlib and then place its source in the build directory of the next one, where it was recompiled. Baz would be built, then baz's source would be put in bar's vendor directory, where baz and bar would be built. After this baz's and bar's sources would be put in foo's vendor directory, where all three would be compiled. This sounds like Go, except that we're throwing away all the results of our builds each time we start a new package.

Since we were just copying the sources from package to package, the simplest solution was to consider the Rust dependencies as shared sources and not as shared libraries. Yes, the same source would be used between multiple programs, but each package already took only the small portion of the shared source that it needed, so there was no benefit to compiling the entire package ahead of time, especially with the mounting recursive dependencies, whose compiled libraries were being thrown away anyway.

Rust-cbindgen ships with a Cargo.toml listing 8 dependencies. It also ships with a Cargo.lock, detailing the 8 dependencies and the bits of other libraries that are needed.
By packing the sources of the 58 enumerated libraries and placing them in the vendor directory, where the necessary parts could be compiled, we ended up at the same place we were headed anyway: only the sources were propagated from package build to package build, only the source was the relevant part, only the source is shared.
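The cost of rebuilding from vendored sources, as in the baz, bar and foo example earlier, can be sketched in a few lines of Rust. The `Graph` type and the function names below are hypothetical, and the model is deliberately crude: it assumes every package build compiles its whole transitive closure and shares no compiled output with any other build.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical dependency graph: package name -> direct dependencies.
type Graph<'a> = BTreeMap<&'a str, Vec<&'a str>>;

/// Everything compiled when `pkg` is built from vendored sources:
/// the package itself plus all of its transitive dependencies.
fn compiled_in_build<'a>(graph: &Graph<'a>, pkg: &'a str) -> BTreeSet<&'a str> {
    let mut seen = BTreeSet::new();
    let mut stack = vec![pkg];
    while let Some(name) = stack.pop() {
        // Visit each crate once per build.
        if seen.insert(name) {
            if let Some(deps) = graph.get(name) {
                stack.extend(deps.iter().copied());
            }
        }
    }
    seen
}

/// How often each crate is compiled when every package in the graph is
/// built separately and compiled results are discarded between builds.
fn compile_counts<'a>(graph: &Graph<'a>) -> BTreeMap<&'a str, usize> {
    let mut counts = BTreeMap::new();
    for &pkg in graph.keys() {
        for name in compiled_in_build(graph, pkg) {
            *counts.entry(name).or_insert(0) += 1;
        }
    }
    counts
}
```

With foo depending on bar and bar depending on baz, `compile_counts` reports baz compiled three times, bar twice and foo once, while the only thing carried from build to build is source.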