From: Mark H Weaver
To: Alexandre Oliva
Cc: guix-devel@gnu.org, Jason Self
Subject: Re: Linux-libre 5.8 and beyond
Date: Sat, 15 Aug 2020 02:03:27 -0400
Message-ID: <875z9kv41h.fsf@netris.org>
References: <87d03vv0nm.fsf@netris.org>

Hi Alexandre,

Alexandre Oliva wrote:
> On Aug 12, 2020, Mark H Weaver wrote:
>
>>>> It may be useful for users with newer hardware devices, which are
>>>> not yet well supported by the latest stable release, to use an
>>>> arbitrary commit from either Linus' mainline git repository or some
>>>> other subsystem tree.
>>>
>>> The cleaning up scripts are version-specific and won't work on an
>>> "arbitrary commit from Linus's mainline git repository" (i.e., someone
>>> wanting to get today's most recent commit going into 5.9.) The scripts
>>> would fall over and die in such a scenario,
>
>> Okay, perhaps this was wishful thinking on my part.
>
> Yup. If you ran a deblob-check in verify mode on the resulting
> tarballs, you'd see how error-prone this is. You'd at least stop
> non-Free code from silently sneaking in and finding its way into running
> on users' machines. That's the *least* someone who runs the
> deblob-scripts on their own should do to smoke-test the result WRT
> *known* freedom issues.

What is this "verify mode" that you're referring to, and where is it
documented?
The word "verify" does not occur in either of the deblob scripts that I
know about, namely "deblob-" and "deblob-check". The string "verif"
occurs a few times, but nothing related to the script functionality. I
don't see anything like a verification mode mentioned in the options
documented at the top of those two scripts.

For the record, it was not my intent to skip any automated checking
provided by these scripts. If we're running the scripts in a suboptimal
way, please tell me a better way.

FYI, right now we're simply running the main 'deblob-' script with no
arguments in the unpacked Linux source directory, with the corresponding
'deblob-check' script in $PATH and $PYTHON pointing to python 2.x. If
'deblob-' exits abnormally or with a non-zero result, the Guix build
process fails. Last I checked, 'deblob-check' was certainly being run
by 'deblob-' as a subprocess, because I had to make several
substitutions of hard-coded paths before it would work in Guix
(e.g. /bin/sed and /usr/bin/python).

>> I had hoped that the deblob scripts would typically mostly work, even
>> if they weren't able to do a comprehensive cleaning.
>
> I'd honestly hope for a much higher standard than that for a
> FSDG-compliant distro, especially one that carries the GNU mark.

As I wrote below:

>> I would oppose adding such a partly-cleaned kernel to Guix itself,

With this in mind, your accusation above is not relevant to Guix.

Above, I was talking about my hope to enable users, *on their own
machines* and using *their own private build recipes*, to make a
best-effort deblobbing of a non-standard kernel variant that they need
to use for whatever reason. If they aren't provided with that option,
the obvious alternative (which I expect 99% of such users would do
anyway) is to simply run a fully-blobbed kernel instead.

> But you don't!
> That's what you get when you jump the gun and use
> outdated cleaning up scripts, without waiting for us to verify,
> update and release them for a newer version.

Here you are conflating two substantially different scenarios:

(1) Attempting to use your deblob scripts on a newer kernel that almost
    certainly includes many new drivers and blobs that aren't detected
    by your scripts. That's the case that I said I would oppose for
    inclusion in Guix.

(2) Using the deblob scripts made for 5.4.57 on a 5.4.58 kernel in order
    to apply security fixes more quickly, and where the probability of
    uncleaned new blobs is quite low.

>> but I wanted to enable users who need to use some other branch of
>> Linux on their own systems to make a best-effort cleaning.
>
> Besides the likelihood of something going wrong, that seems like a
> backwards goal for a distro that is not expected to as much as point
> users at a non-Free package.

It's *not* a goal for Guix, and it wasn't even my motivation for
teaching Guix to run the Linux-libre deblob scripts. It's just
something that, on a whim, I chose to include in my list of possible
advantages to having such functionality, nothing more.

> I'm sure that's not what you intend, but this arrangement, plus your
> mention of hurriedly getting releases out, adds up to an incentive to
> disable the deblobbing so as to get a faster build.

I don't understand how you reached this conclusion. As far as I can
tell, changing Guix to run the deblob scripts made *no* difference to
what someone would have to do to ask Guix to build a fully-blobbed
kernel.

> I hope you'll agree that this is undesirable.

Agreed.

>> In my experience, the deblob scripts are very rarely changed after the
>> first few point releases of a stable release series.
>
> My personal experience tells me otherwise.
> 5.7 had only one update at
> .8; 5.6, at .6 and .16; 5.5, at .3, .11 and .19; 5.4, at .14, .18, .27,
> .34 and .44; 5.3, at .4 and .11; 5.2 at .1, .3 and .11; 5.1 at .2, .18
> and .20; 5.0 at .7 and .16. What you describe was true only of 4.17,
> 4.10, 4.3, 3.13, 3.5, and 3.2, i.e. 6 out of the 50 major releases
> starting at 3.0.

I only checked your claims regarding 5.4, and found that you're mistaken
about them being updated in 5.4.44. In fact, the 'deblob-5.4' and
'deblob-check' files, as found in /pub/linux-libre/releases/, have not
changed since version 5.4.34.

Moreover, of the 4 deblob updates (.14, .18, .27, and .34) that have
*actually* been made so far during the 5.4.x series, IIUC only one of
them declared new blobs to remove, namely the update for 5.4.27.

The 5.4.14 update only removed extraneous backslashes in existing
regexps, changing "\e" to "e" and "\@" to "@". I don't know whether
these extraneous backslashes caused blobs to be included in the
linux-libre tarballs, but if so, that presumably already happened in
5.4.13 and would have happened even if we had used your official
tarballs, no?

The 5.4.18 and 5.4.34 updates only added new 'accept' directives. I
guess that means that temporarily omitting these additions wouldn't
cause new blobs to be included, is that right?

>> I know this because I always check for updates to the deblob scripts
>> whenever I update linux-libre in Guix. In practice, the deblob scripts used by
>> Guix are never more than 1 or 2 micro versions behind the version of
>> Linux they are applied to.
>
> There have been 61 script updates for the 1274 4.*.*-gnu* and 5.*.*-gnu*
> stable releases, so Guix has shipped potentially non-FSDG code, that
> *would* have been flagged by deblob-check on the tarballs, at between 5%
> and 10% of these releases. Does that sound like a good standard for a
> freedom-first distro to aim for?
If it were true that we've been including blobs in 5-10% of our
linux-libre releases, I agree that would be a serious problem. However,
I believe your estimates are way off, so I took a closer look at the
statistics for the 5.4, 4.19, and 4.14 kernels.

I already wrote about 5.4 above. If we include only the deblob updates
that added checks for new blobs, it's only happened once in 58 upstream
updates, i.e. for 1.7% of the updates.

In the 4.19 series, although the deblob scripts have been updated 8
times, of those 8, 3 only add 'accept' directives, and a fourth only
makes the same regexp fixes mentioned above ("\e" -> "e" and "\@" ->
"@"). In other words, only 4 of these deblob updates might result in
new blobs being recognized. So that's 4 new blob updates out of 139
upstream updates, which comes out to 2.9%.

In the 4.14 series, the deblob scripts were updated 6 times, but 3 only
add 'accept' directives and a fourth only makes the regexp fix. So that
comes out to 2 new blob updates out of 193 upstream updates, which comes
out to 1.0%.

So, unless I'm missing something, it's more accurate to say that when I
push a Linux-libre security update before waiting for you to bless it,
I'm taking a 1-3% risk that a blob might end up in the result.

I find that level of risk undesirable. I would certainly rather avoid
it. I guess where you and I differ is that I *also* find it undesirable
to subject our users to unnecessary delays in getting these security
updates, because that *also* carries a risk, namely the risk that their
systems will be compromised due to a delayed security update.

To my mind, it makes sense to balance these two risks, especially since
we know that it's simply impractical to completely eliminate the risk of
non-FSDG-compliant code occasionally finding its way into Guix.
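As a quick cross-check of the percentages above, here is a minimal
Python sketch. The counts are simply the ones stated in this message;
the names `series`, `blob_update_rate`, and `all_script_updates_rate`
are made up for illustration.

```python
# (deblob updates that might flag new blobs, upstream point releases),
# per stable series, as stated in the text above.
series = {
    "5.4": (1, 58),
    "4.19": (4, 139),
    "4.14": (2, 193),
}


def blob_update_rate(new_blob_updates, upstream_updates):
    """Return the share of upstream updates in percent, to one decimal."""
    return round(100.0 * new_blob_updates / upstream_updates, 1)


rates = {name: blob_update_rate(*counts) for name, counts in series.items()}
for name, rate in rates.items():
    print(f"{name}: {rate}%")

# For comparison: counting *every* script update across all 4.x and 5.x
# stable releases, using the 61 and 1274 figures from the quoted text.
all_script_updates_rate = blob_update_rate(61, 1274)
print(f"all script updates: {all_script_updates_rate}%")
```

The per-series figures land between 1% and 3%, and even the count of
all script updates comes out just under 5%.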
>>> The moment that the Linux-libre project determines that scripts are
>>> suitable is the moment that the new cleaned-up release is ready to
>>> publish in git and the appropriate tags will then appear in git. The
>>> compressed tarballs come some time later.
>
>> I prefer to avoid unnecessary delays when applying micro kernel updates,
>
> Sorry, but it doesn't look like you do. If you did, you would be taking
> a cleaned up tree instead of re-deblobbing it.

I'm not concerned about another 30 minutes (or whatever) to run the
deblob scripts, especially if the alternative is to trust the integrity
of your machines unnecessarily. The delays I'd prefer to avoid are ones
measured in tens of hours, which is occasionally how long it takes
before Linux-libre reacts to a new upstream update.

> You skip even the automated verification we do, which saves you some
> time, but at what price?

As I wrote above, if there's some automated verification that we are
failing to do, please tell me how to do it. It was certainly not my
intent to skip any such verification.

>> I also consider it unwise for all of us, as a matter of habit or policy,
>> to trust the integrity of the computer systems used by the Linux-libre
>> project to perform the deblobbing.
>
> I welcome double-checking of our cleaning up at all levels, but why are
> you setting a higher trust standard for us than for a project known to
> be at odds with our shared goals, such as Linux?

I don't understand how you reached the conclusion that I'm setting a
higher trust standard for Linux-libre than for Linux.

The principle I'm following here is simply to avoid relying on the
integrity of any system if I can easily avoid it. In particular, if I
can easily run an automated process on my own machine instead of relying
on some other system to provide pre-generated outputs for me, then I
prefer to do it myself.
> You don't apply the patches that went into it since the last known
> good release to double-check their releases, do you? For most
> projects, you just take their tarballs or tags and build it.

That's true, and I agree that it's something we could improve on. It
would be preferable to fetch from a git repository instead, and
preferably one that has a lot of eyes on it.

> For Linux-libre, you start from (untrustworthy) Linux, run the
> (presumed untrustworthy) cleaning up scripts, and blindly trust the
> result.

I agree that we cannot avoid trusting many people and systems, and that
in most cases that trust is blind. In this case, we cannot avoid
trusting the Linux source code (even if we download exclusively from the
Linux-libre project), and we cannot avoid trusting the Linux-libre
deblob scripts.

However, I reject the argument that because we must trust X and Y, we
might as well trust Z as well.

> There's no self-verification run with deblob-check,

Again, if we're failing to do that, it's a bug that has not previously
been brought to my attention. See above.

> no compare with our release, nothing. If you were to test the
> integrity of our releases, you'd think you'd at least look at them.

I *did* compare with your releases when I first taught Guix how to run
the deblob scripts, but not since then. Anyway, I fail to see the
relevance of this fact. I agree that it would be useful for someone
running Guix to compare our generated tarballs to yours. There are
millions of useful things I *could* do with my time, but alas, my
energies are limited.

> If you were to test the integrity of our releases, you'd think you'd at
> least look at them. Starting from a known-good Linux release and
> applying patches to double-check the results is expensive, so it makes
> sense to do that only occasionally, rather than as part of every build.
> Deblobbing and checking the result is also expensive, so it also makes
> sense for you to do so only occasionally, rather than as part of every
> build.

As far as I can tell, the vast majority of Guix users use substitutes
provided by its build farm. I guess that it's fairly rare for people to
build everything on their own machines, as I do.

> But the point stands that, for someone who'd rather trust no one, you're
> blindly trusting both Linux and Linux-libre. The former when it comes
> to base releases you don't check; the latter when it comes to scripts
> whose results you hardly even look at. Why not reduce your trust base
> to just Linux-libre,

That's not possible. Clearly, you do not have the capacity to audit all
of the code that Linux produces. Therefore, by trusting Linux-libre, we
must implicitly also trust the Linux project. That much we cannot
avoid. We also cannot avoid trusting your deblob scripts.

However, we *can* easily avoid trusting the integrity of the systems
that you use to run the deblob scripts.

> and treat it as a citizen of the same class as
> nearly every other project you build, and satisfy your trust-but-verify
> needs looking into what changes between one of our releases and another?

You seem to be suggesting that I'm treating Linux-libre with less
respect than other projects in Guix. I reject that claim. In fact, I
strongly support reducing Guix's reliance on pre-generated outputs
produced by *any* project. I'm not singling out the Linux-libre project
here.

For example, one of the things I've recently been thinking about is that
Guix currently trusts the integrity of all the scripts generated by
autoconf/automake/libtool/etc for most of the tarballs that we download.
Those scripts are generated on random developer machines, and they are
very difficult to reproduce, because they depend on the precise versions
of many other packages, and Debian also seems to have extensively
modified their automake, leading to other differences.
I would be in favor of working toward generating those scripts in Guix
itself where possible, but it's a big job and likely to cause
maintenance headaches.

For another example, I also taught Guix how to generate the IceCat
source tarball from the corresponding Firefox tarball, and I intend to
keep it that way, although I'm currently an IceCat maintainer.

>> One question: Would it solve the problem that I mentioned in my earlier
>> email, namely the problem of how to determine which precise commit
>> introduced a regression between two stable kernel releases?
>
> No. There are much better (faster and less risky) ways to tend to that
> requirement, see #bisecting below.

[...]

> #bisecting
>
> You can even take one of our releases and apply the patches that went
> into the next upstream stable release, and check that what you get
> matches our own corresponding release. Some 98% of the time, they will
> be exact matches. Occasionally, there will be a difference, and then
> you'll likely find a corresponding change in the deblobbing scripts, or
> a preexisting pattern that caused the change. We do this for every
> release, as part of our pre-release checks, and you're welcome to do so
> as well, and to use the resulting tree to bisect problems.

I agree that this would be faster, but I fail to see how it's "less
risky" than running the deblob scripts meant for Linux-libre X.Y.Z on a
git checkout of the upstream stable git repo between X.Y.(Z-1) and
X.Y.Z.

More importantly, it's a much less straightforward thing to implement.
In the current implementation, we get the ability to deblob arbitrary
git commits from the same stable branch essentially for free.
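For illustration only, the "check that what you get matches our own
corresponding release" step from the quoted text could be sketched in
Python roughly as below. This is not how Linux-libre or Guix does it:
the function names `relative_files` and `tree_mismatches` are
hypothetical, and the sketch assumes both source trees are already
unpacked on disk and ignores symlinks, permissions, and empty
directories.

```python
import filecmp
import os


def relative_files(root):
    """Collect every file under root as a path relative to root."""
    found = set()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def tree_mismatches(patched_root, release_root):
    """Return sorted relative paths missing on one side or differing in content."""
    patched = relative_files(patched_root)
    release = relative_files(release_root)
    # Files present in only one of the two trees.
    mismatches = set(patched ^ release)
    for rel in patched & release:
        # shallow=False forces a byte-for-byte comparison instead of
        # trusting matching size and modification time.
        same = filecmp.cmp(os.path.join(patched_root, rel),
                           os.path.join(release_root, rel),
                           shallow=False)
        if not same:
            mismatches.add(rel)
    return sorted(mismatches)
```

An empty result would correspond to the "exact match" case; a non-empty
one only lists where to start looking, it does not say which side is
right.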
I guess you're suggesting that I should implement a radically different
mechanism specifically for this purpose, one that extracts the
individual patches from the upstream stable git repository, attempts to
apply them to the base Linux-libre release, and compares the result to
the next Linux-libre release, and that I should then implement my own
bisection functionality.

If I were to implement this, what would you suggest I do if the patches
fail to apply, or if the result fails to match the next Linux-libre
release?

     Thanks,
       Mark