From: Efraim Flashner
Subject: Re: [BLOG] rust blog post
Date: Tue, 26 Nov 2019 12:27:37 +0200
To: Pierre Neidhardt
Cc: guix-devel@gnu.org

Hopefully this is better. I added a new line between each paragraph.

On Tue, Nov 26, 2019 at 10:58:41AM +0100, Pierre Neidhardt wrote:
> I think the attachment broke the formatting of the file (there is no
> paragraph break). Could you resend it?
>

-- 
Efraim Flashner   אפרים פלשנר
GPG key = A28B F40C 3E55 1372 662D  14F7 41AA E7DC CA3D 8351
Confidentiality cannot be guaranteed on emails sent or received unencrypted

[Attachment: rust-in-guix]

It's easy to think of Rust as a new programming language but it has already been around for five years. Rust has made it past its 1.0 release and the compiler is written in Rust. We even have mrustc to act as a secondary method to bootstrap new Rust releases without falling back to downloading precompiled tarballs. So what is the state of Rust in Guix today?

Truthfully, Rust in Guix could be better. The developer story for Rust is pretty straightforward: write your program, declare your dependencies in a Cargo.toml file, and ```cargo foo``` will figure out your dependency chain. ```cargo build``` will download any missing dependencies, even using a cache directory to reduce downloads, and compile the bits of the dependencies that are needed. But what about distro maintainers?

Obviously we can't download dependencies at build time; they need to be packaged ahead of time. So we package those dependencies. But wait, those dependencies have dependencies that are needed, and those ones too. It's dependencies all the way down, hidden in 5 years of iterative development that we're late to the party to, trying to capture snapshots in time where specific versions of libraries were built using previous generations. All this all the way back to the beginning, whenever that is.

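To make the "dependencies all the way down" point concrete, here is a minimal Python sketch of how a couple of declared dependencies can expand into a much larger set that has to be packaged. The crate names and the graph are invented for illustration; they are not the dependency data of any real crate, and this is not how the Guix importer is implemented.

```python
from collections import deque

# Hypothetical dependency graph: crate name -> crates it declares directly.
# All names here are made up for illustration only.
DEPS = {
    "app": ["cli", "serde-ish"],
    "cli": ["term", "strings"],
    "serde-ish": ["strings", "macros"],
    "term": ["libc-ish"],
    "strings": [],
    "macros": ["syntax"],
    "syntax": ["strings"],
    "libc-ish": [],
}


def transitive_deps(root, graph):
    """Return every crate reachable from root, excluding root itself."""
    seen = set()
    queue = deque(graph[root])
    while queue:
        crate = queue.popleft()
        if crate in seen:
            continue  # already visited through another path
        seen.add(crate)
        queue.extend(graph.get(crate, []))
    seen.discard(root)  # guard against a cycle leading back to root
    return seen


direct = DEPS["app"]
everything = transitive_deps("app", DEPS)
print(f"declared: {len(direct)}, to package: {len(everything)}")
# prints "declared: 2, to package: 7"
```

Even in this tiny made-up graph the number of crates to package is several times the number declared, which is the same effect, at a much smaller scale, that the packager runs into.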
Obviously humans are prone to errors, so to work around this while packaging Rust crates Guix has effectively two importers for crates: one that will import a specific version and list its dependencies, and one that can take a crate and recursively import all the packages that it depends on. Currently some work is needed to allow the recursive importer to interpret version numbers, but for now it works quite well.

Taking a break from Rust for a moment, let's look at some of the other languages that are packaged. Packages written in C/C++, processed with autotools or CMake or Meson, are the easiest. Dependencies are declared, source code is provided, and there's a clear distinction between source code and compiled binary; source code is for hacking on, binaries are for executing. The closest thing to a middle ground is libraries, which allow programs to use features from other programs. In order to use a package, all of its dependencies must be packaged and the libraries linked.

Taking a look at the other end we have JavaScript. JavaScript is source code; it's ready to be read and hacked on. JavaScript is already ready to be run, therefore it must be a binary. It's... both? JavaScript libraries leave distro maintainers in a difficult position. Building JavaScript ends up in the same problem as we saw with Rust: recursive dependencies all the way down, iterative versions depending on previous ones, and a misty past from whence everything sprang forth, which must be recreated in order to bring us back to the present day. But there's more difficulty: often even after a 'build' phase has been run and tests have been run on JavaScript we're left with unchanged code. Except now it's no longer source, it's a binary... or something. So just what did we build and test?

We can worry about JavaScript another time; Rust has a clear boundary between source code and binaries. So how about Python?
Python is a scripting language and can be run without being compiled, but it can also be compiled (pre-interpreted?) to bytecode and installed either locally or globally. That leaves us with source code which can double as a binary, and bytecode which is clearly a binary. Given these two states, we declare the uncompiled version as source code, ignore that it can be run as a script except when testing the code, and we never return to second-guess ourselves.

How about Go? Go is another language that defies packaging efforts, primarily because build instructions often make use of the HEAD of other git branches, not tagged and released versions. That the names of the libraries are long and cumbersome is mostly a secondary issue. On the developer side a binary is a ```go build``` away. Go will download missing source and compile libraries as needed. On the packager side the libraries are carefully gathered one by one, precompiled, and placed carefully in a directory hierarchy for use in future builds. What could be a long build of a program is replaced by an intermediate series of packages where libraries are pre-compiled, and at each stage only the new code has to be compiled.

For all except the distro maintainer, the similarities are strong between Rust and Go. In both cases dependencies are downloaded as part of the build process, there's a cache for the downloaded sources and the compiled libraries, and build artifacts can be reused between different programs with overlapping dependencies. For the distro maintainer many of these similarities are thrown out. Dependencies are packaged ahead of time and previously packaged libraries are literally a cache. Libraries can be reused for other packages, yes, but for Rust they're not.

Why not? If they're already compiled why not reuse them?

Previously we've discussed source code and compiled binaries (or libraries), but in Rust there are two types of libraries.
There are dynamic libraries, packaged as ```libfoo.so```, and there are Rust libraries, packaged as ```libfoo.rlib``` or ```libfoo-MAGICHASH.rlib```. When a Rust package declares a dependency on a Rust library, it doesn't declare a dependency on the whole library but rather just on the parts that it needs. This means that we can get away with packaging only a portion of the dependent library, or the library with only some of its features or its own dependencies. When compiling a final binary, a Rust binary doesn't link to an rlib; it takes just the part that it needs and incorporates it into the binary. As far as package maintainers are concerned, this isn't ideal but it is something we can live with; we already have this case with static libraries from other languages. If we were to compile the binary manually the command would be ```rustc --binary foo --extern bar=/path/to/libbar.rlib``` and we'd continue on. However, when bar depends on baz, the similar command, ```rustc --library bar --extern baz=/path/to/libbaz.rlib``` _doesn't_ link libbaz to libbar. This leaves us in a pickle; we know which libraries we need but we're unable to compile them individually and build them up iteratively until we reach the binary end goal.

One of our packaged Rust programs, rust-cbindgen, is used by IceCat. Rust-cbindgen declares 8 (TODO: check this number) dependencies. When run outside of the build environment ```cargo build``` downloads a total of 58 (TODO: check this number) packages, compiles them and produces a binary. Our recursive importer created more than 300 new packages before it was told to stop. Returning to our build process for Rust libraries, since we couldn't link one rlib to another rlib, we opted to compile one rlib and then place its source in the build directory of the next one where it was recompiled.
Baz would be built, then baz's source would be put in bar's vendor directory where baz and bar would be built. After this baz's and bar's sources would be put in foo's vendor directory, where all three would be compiled. This sounds like Go, except that we're throwing away all the results of our builds each time we start a new package.

Since we were just copying the sources from package to package, the simplest solution was to consider the Rust dependencies as shared sources and not as shared libraries. Yes, the same source would be used between multiple programs, but each package already only took the small portion of the shared source that it needed, so there was no benefit to compiling the entire package ahead of time, especially with the mounting recursive dependencies, whose compiled libraries were being thrown away anyway.

Rust-cbindgen ships with a Cargo.toml listing 8 dependencies. It also ships with a Cargo.lock, detailing the 8 dependencies and the bits of other libraries that are needed. By packaging the sources of the 58 enumerated libraries and placing them in the vendor directory where the necessary parts could be compiled, we ended at the same place we were headed anyway; only the sources were propagated from package build to package build, only the source was the relevant part, only the source is shared.
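
The vendoring scheme described above can be sketched roughly as follows. This is a toy model in Python using the foo, bar, and baz names from the text; the `vendor_sources` and `build_order` helpers are hypothetical and only illustrate which sources would land in each package's vendor directory, not how the Guix build system actually does it.

```python
# Toy model: foo depends on bar, bar depends on baz (names from the text).
DEPS = {"foo": ["bar"], "bar": ["baz"], "baz": []}


def vendor_sources(package, graph):
    """Sources copied into package's vendor directory: every direct and
    indirect dependency, deepest first, since no compiled output is reused."""
    vendored = []
    for dep in graph[package]:
        for name in vendor_sources(dep, graph) + [dep]:
            if name not in vendored:
                vendored.append(name)
    return vendored


def build_order(root, graph):
    """Dependencies first, then the package that needs them."""
    return vendor_sources(root, graph) + [root]


for pkg in build_order("foo", DEPS):
    sources = vendor_sources(pkg, DEPS)
    compiled = sources + [pkg]  # everything vendored is compiled again
    print(f"{pkg}: vendor={sources} compiles={compiled}")
```

In this model baz is compiled three times (once for itself, once for bar, once for foo), which is the discarded work the text describes; treating the dependencies purely as shared sources avoids the intermediate builds without changing what the final package compiles.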