* Re: Linux-libre 5.8 and beyond
@ 2020-08-09 20:15 Jason Self
  2020-08-13  0:39 ` Mark H Weaver
  0 siblings, 1 reply; 17+ messages in thread
From: Jason Self @ 2020-08-09 20:15 UTC
To: guix-devel

> the linux-libre project periodically deletes most of its older
> tarballs, even if there are no accidents.

Just FYI that git://linux-libre.fsfla.org/releases.git was created
mainly to solve that problem. Versions are now pretty much permanent.

> It may be useful for users with newer hardware devices, which are
> not yet well supported by the latest stable release, to use an
> arbitrary commit from either Linus' mainline git repository or some
> other subsystem tree.

The cleaning up scripts are version-specific and won't work on an
"arbitrary commit from Linus's mainline git repository" (i.e., someone
wanting to get today's most recent commit going into 5.9.) The scripts
would fall over and die in such a scenario, or if forced to continue
by using --force the result would be incomplete cleaning.

Using the scripts on a version other than the precise version that
they were intended for can also cause them to fail in obscure ways, as
Vagrant Cascadian has found out firsthand by running the 5.7 cleaning
scripts on 5.8 (that was determined to be the source of the problems
they were having.) If you look closely at the results of Vagrant
Cascadian's attempt, you'll see there were more than syntax errors:
plenty of blobs were certainly left in.

Thus: As said, the clean up scripts can only be used for the version
that they were intended for. Use with any other version invites
problems.

> It allows us to update to a new point version (which usually
> includes security fixes) more quickly, before the linux-libre
> project reacts.

Any attempt to outrun the Linux-libre project and get updates out
sooner is unwise.
While major new kernel releases will definitely require updates to the
cleanup scripts, even minor patched versions occasionally require
changes too. Updating to a new version prior to the Linux-libre project
having had time to review that new version and determine if any updates
are needed to the scripts risks introducing freedom problems in the
corresponding Guix version.

The moment that the Linux-libre project determines that scripts are
suitable is the moment that the new cleaned-up release is ready to
publish in git and the appropriate tags will then appear in git. The
compressed tarballs come some time later.

* Re: Linux-libre 5.8 and beyond
From: Mark H Weaver @ 2020-08-13 0:39 UTC
To: Jason Self; +Cc: guix-devel

Hi Jason,

I didn't see your email until just now. I read this list only
sporadically, so it's best to keep me in the CC list for messages that
you'd like me to see, or that are responses to me.

Mark H Weaver <mhw@netris.org> wrote:
>> the linux-libre project periodically deletes most of its older
>> tarballs, even if there are no accidents.

Jason Self <jason@bluehome.net> responded:
> Just FYI that git://linux-libre.fsfla.org/releases.git was created
> mainly to solve that problem. Versions are now pretty much permanent.

That's helpful, thanks. I didn't know about this. Out of curiosity, is
this git repository advertised anywhere? I wasn't able to easily find
it on <https://www.fsfla.org/ikiwiki/selibre/linux-libre/>, but I didn't
look carefully, perhaps I missed it.

One question: Would it solve the problem that I mentioned in my earlier
email, namely the problem of how to determine which precise commit
introduced a regression between two stable kernel releases? If not, I
think that justifies the machinery that Guix includes to do the
deblobbing itself.

>> It may be useful for users with newer hardware devices, which are
>> not yet well supported by the latest stable release, to use an
>> arbitrary commit from either Linus' mainline git repository or some
>> other subsystem tree.
>
> The cleaning up scripts are version-specific and won't work on an
> "arbitrary commit from Linus's mainline git repository" (i.e., someone
> wanting to get today's most recent commit going into 5.9.)
> The scripts
> would fall over and die in such a scenario,

Okay, perhaps this was wishful thinking on my part. I had hoped that
the deblob scripts would typically mostly work, even if they weren't
able to do a comprehensive cleaning. I would oppose adding such a
partly-cleaned kernel to Guix itself, but I wanted to enable users who
need to use some other branch of Linux on their own systems to make a
best-effort cleaning.

>> It allows us to update to a new point version (which usually
>> includes security fixes) more quickly, before the linux-libre
>> project reacts.
>
> Any attempt to outrun the Linux-libre project and get updates out
> sooner is unwise. While major new kernel releases will definitely
> require updates to the cleanup scripts, even minor patched versions
> occasionally require changes too. Updating to a new version prior to
> the Linux-libre project having had time to review that new version and
> determine if any updates are needed to the scripts risks introducing
> freedom problems in the corresponding Guix version.

In my experience, the deblob scripts are very rarely changed after the
first few point releases of a stable release series. I know this
because I always check for updates to the deblob scripts whenever I
update linux-libre in Guix. In practice, the deblob scripts used by
Guix are never more than 1 or 2 micro versions behind the version of
Linux they are applied to.

> The moment that the Linux-libre project determines that scripts are
> suitable is the moment that the new cleaned-up release is ready to
> publish in git and the appropriate tags will then appear in git. The
> compressed tarballs come some time later.

I prefer to avoid unnecessary delays when applying micro kernel updates,
because I assume that many of the fixes are potentially security fixes
(although they are rarely marked as such because upstream does not
attempt to determine the security relevance of most fixes, which is
reasonable).

I also consider it unwise for all of us, as a matter of habit or policy,
to trust the integrity of the computer systems used by the Linux-libre
project to perform the deblobbing. It's not that I doubt the competence
of those people who maintain or administer those systems; it's that I
think it's unwise to trust *any* computer system that we can easily
avoid trusting.

Personally, I don't consider any modern civilian computer system to be
trustworthy, and especially not one that paints a target on its back by
being a potential vector for compromising the machines of large numbers
of users.

Enabling users to run the Linux-libre deblob scripts on their own
computers (as I do; I *never* use substitutes) enables them to remove
one computer system from the set of systems that they must trust. I
think that's a good thing.

      Regards,
        Mark

* Re: Linux-libre git repository
From: Vagrant Cascadian @ 2020-08-13 16:47 UTC
To: Mark H Weaver, Jason Self; +Cc: guix-devel

On 2020-08-12, Mark H Weaver wrote:
> Mark H Weaver <mhw@netris.org> wrote:
>>> the linux-libre project periodically deletes most of its older
>>> tarballs, even if there are no accidents.
>
> Jason Self <jason@bluehome.net> responded:
>> Just FYI that git://linux-libre.fsfla.org/releases.git was created
>> mainly to solve that problem. Versions are now pretty much permanent.
>
> That's helpful, thanks. I didn't know about this. Out of curiosity, is
> this git repository advertised anywhere? I wasn't able to easily find
> it on <https://www.fsfla.org/ikiwiki/selibre/linux-libre/>, but I didn't
> look carefully, perhaps I missed it.

News item for 2020-05-31 mentions it, but clearly it should be more
prominently displayed or documented.

> One question: Would it solve the problem that I mentioned in my earlier
> email, namely the problem of how to determine which precise commit
> introduced a regression between two stable kernel releases? If not, I
> think that justifies the machinery that Guix includes to do the
> deblobbing itself.

The granularity appears to be at the level of released tags. I see tags
with one commit per release, with an independent history from previous
versions; I *think* git bisect wouldn't work without some manual
fiddling and you'd have to manually bisect based on version.

I tried a quick experiment using the linux-libre git repository to build
a package for arm64 in gnu/packages/linux.scm:

  (define-public linux-libre-fsfla-git-arm64-generic
    (let* ((version "5.8.1-gnu")
           (source (origin
                     (method git-fetch)
                     (uri (git-reference
                           (url "git://linux-libre.fsfla.org/releases.git")
                           (commit (string-append "sources/v" version))))
                     (file-name (git-file-name "linux-libre-fsfla-git" version))
                     (sha256
                      (base32
                       "05v2l4r34nbkv6wpgrzydlb0fkpswpvzdya9vx30wap3n9a9wp6n"))
                     (patches (list %boot-logo-patch
                                    %linux-libre-arm-export-__sync_icache_dcache-patch)))))
      (make-linux-libre* version
                         source
                         '("aarch64-linux")
                         #:defconfig "defconfig"
                         #:extra-version "fsfla-git-arm64-generic")))

The source checkout was quite slow to download, and took up ~1GB in the
store once completed. I'm not sure how guix's git origin works exactly;
if it downloads the entire git history even to perform a shallow
checkout of a single commit, and then throws out the git history? It did
appear to be calling git with flags to perform a shallow checkout.

It certainly was slower than downloading a compressed tarball. The
de-duplication of /gnu/store might still be beneficial if you have
significantly more than ~10 versions in /gnu/store, as not every file
changes with every release, but overall using compressed tarballs seems
to be faster to download and extract even on a slow machine.

This partly points to challenges with guix's handling of git
repositories, exacerbated by larger git repositories. It would be more
viable if there was some way to cache git results such as running "git
clone --bare ~/.cache/guix/..." if not present, and "git fetch origin"
if present and then populating the store from cached git repository,
much like done with "guix pull" ... Surely this has been brought up
before? Maybe this breaks the purity of guix's functional paradigm, but
arguably no more than a caching http proxy really.

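The caching idea in the previous paragraph could look roughly like the
following. This is only a sketch under stated assumptions: the
`cached_fetch` helper and the cache directory layout are hypothetical,
and nothing here reflects how Guix's git-fetch is actually implemented.
It keeps one bare clone per URL and, on later runs, only fetches new
branches and tags into it.

```shell
# cached_fetch is a hypothetical helper, not a Guix or git command.
# First run: create a bare clone in the cache directory.
# Later runs: fetch only what is new from the same remote.
cached_fetch() {
    url=$1      # repository URL, or a local path
    cache=$2    # cache directory; the location is an assumption
    if [ -d "$cache" ]; then
        # A bare clone has no default fetch refspec, so name the refs
        # explicitly to update branches and tags in place.
        git -C "$cache" fetch --quiet origin \
            '+refs/heads/*:refs/heads/*' '+refs/tags/*:refs/tags/*'
    else
        git clone --quiet --bare "$url" "$cache"
    fi
}
```

A source tree for a given tag could then be exported from the cache
with "git archive", without contacting the server again.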
It is also possible to retrieve tarballs directly from linux-libre git
tags, though I know that, at least for projects hosted on github, this
does occasionally result in non-identical tarballs. Not sure what factors
might trigger this, other than changing tags, but possibly different
git versions, tar versions and flags, and compression tool versions
and optimizations could be a factor. Reproducible builds has
documented some potential causes:

  https://reproducible-builds.org/docs/archives/

There are also the released linux-libre tarballs, though that may have
the persistence issue previously mentioned. The code to do so is still
present in guix, I made a package using:

  (define-public linux-libre-fsfla-arm64-generic
    (make-linux-libre "5.8.1"
                      "1v7glmvz3laj1awh5zrqclp2pzfs0cjf6y3n6v97j7z901s1vlxd"
                      '("aarch64-linux")
                      #:defconfig "defconfig"
                      #:extra-version "fsfla-arm64-generic"))

After patching the make-linux-libre call to also include a patch needed
for newer versions:

-                     (patches (list %boot-logo-patch)))
+                     (patches (list %boot-logo-patch
+                      %linux-libre-arm-export-__sync_icache_dcache-patch)))

Not sure why that patch isn't upstream; Debian has been carrying it for
some years now... and my guix build failed to build without it on
aarch64/arm64.

live well,
  vagrant

* Re: Linux-libre git repository
From: Jason Self @ 2020-08-14 0:03 UTC
To: Vagrant Cascadian; +Cc: guix-devel

On Thu, 13 Aug 2020 09:47:21 -0700
Vagrant Cascadian <vagrant@reproducible-builds.org> wrote:

> It is also possible to retrieve tarballs directly from linux-libre git
> tags, though I know that, at least for projects hosted on github, this
> does occasionally result in non-identical tarballs. Not sure what
> factors might trigger this, other than changing tags, but possibly
> different git versions, tar versions and flags, and compression tool
> versions and optimizations could be a factor. Reproducible builds has
> documented some potential causes:

Adding in compression changes this because, for just one example,
compression details can change between versions of compressors.
Assuming that there is no compression and there aren't changes in the
underlying git repository and assuming that git archive is invoked with
precisely the same parameters each time, git archive is supposed to
generate bit-identical tarballs between different platforms/versions of
git (it's considered a bug if it doesn't.)

Indeed, the Linux stable tree takes advantage of this reproducibility
by adding a GPG signature for the uncompressed tarballs as a git note
under refs/notes/signatures/tar. The signature also includes a comment
with the precise command to regenerate the uncompressed tarball with
git archive. This then makes it possible to verify a GPG signature of
an uncompressed tarball that way. An example is [0]. cgit automatically
adds the (sig) link when the corresponding git note is added in
refs/notes/signatures/tar but they can also be accessed directly from
within git.

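The "same parameters, same bytes" property of uncompressed git archive
output can be spot-checked locally. A minimal sketch, assuming a local
clone is available: `archive_is_stable` is a hypothetical helper, not
part of git, and it only compares two runs on the same machine, so it
says nothing about other platforms or git versions.

```shell
# archive_is_stable is a hypothetical helper, not part of git.
# It succeeds when two "git archive" runs with identical parameters
# yield byte-identical uncompressed tarballs.
archive_is_stable() {
    repo=$1      # path to a local clone
    tag=$2       # tag or commit to export
    prefix=$3    # directory prefix inside the tarball
    first=$(mktemp) || return 2
    second=$(mktemp) || return 2
    git -C "$repo" archive --format=tar --prefix="$prefix" "$tag" > "$first" &&
    git -C "$repo" archive --format=tar --prefix="$prefix" "$tag" > "$second" &&
    cmp -s "$first" "$second"
    status=$?
    rm -f "$first" "$second"
    return "$status"
}
```

Comparing across machines would mean exchanging a checksum of the
uncompressed tar stream, which is essentially what a detached signature
over that stream provides.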
I found that useful after learning that GPG signatures within git itself
"only validate the commit file contents up to the SHA-1 of the top
level tree, it's not a GPG signature of the entire tree state. This
means that a SHA-1 collision on the tree object, or any blob object,
still results in a valid GPG signature." It seemed to be a neat way to
sidestep the whole matter of SHA-1 falling apart, at least until git
moves on to SHA-2 at some as-yet-unknown future point.

Anyway, the Linux-libre git repository similarly contains GPG signatures
for the uncompressed tarballs, but as tags rather than as a git note;
either way the outcome is the same.

[0]
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/
refs/notes/signatures/tar

* Re: Linux-libre git repository
From: Danny Milosavljevic @ 2020-08-14 14:03 UTC
To: Vagrant Cascadian; +Cc: guix-devel, Jason Self

Hi Vagrant,

On Thu, 13 Aug 2020 09:47:21 -0700
Vagrant Cascadian <vagrant@reproducible-builds.org> wrote:

> The source checkout was quite slow to download, and took up ~1GB in the
> store once completed. I'm not sure how guix's git origin works exactly;

git init
git remote add origin <url>
if git fetch --depth 1 origin <commit>
then
  git checkout FETCH_HEAD
else
  echo "Failed to do a shallow fetch; retrying a full fetch..."
  git fetch origin
  git checkout <commit>
fi
if ,recursive?
then
  git submodule update --init --recursive
  rm -rf .git for each submodule
fi
rm -rf .git

See guix/build/git.scm .

There exist git servers that have disabled fetching by commit hash for
"security" reasons (if you checked in a file containing a password and
then removed it again, and no branch or tag to it exists, nobody can get
to it even if he knew the commit hash). We would always use the fallback
for those servers.

> if it downloads the entire git history even to perform a shallow
> checkout of a single commit, and then throws out the git history?

As a fallback if the above doesn't work.

> appear to be calling git with flags to perform a shallow checkout.

Yes.

* Re: Linux-libre 5.8 and beyond
From: Alexandre Oliva @ 2020-08-14 13:47 UTC
To: Mark H Weaver; +Cc: guix-devel, Jason Self

Hello, Mark,

On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote:

> Mark H Weaver <mhw@netris.org> wrote:
>>> the linux-libre project periodically deletes most of its older
>>> tarballs, even if there are no accidents.

> Jason Self <jason@bluehome.net> responded:
>> Just FYI that git://linux-libre.fsfla.org/releases.git was created
>> mainly to solve that problem. Versions are now pretty much permanent.

> That's helpful, thanks. I didn't know about this. Out of curiosity, is
> this git repository advertised anywhere?

Not much. It was mentioned back in the announcements of 5.7-gnu and a
few subsequent ones on social media; in the 5.7-gnu news entry in the
Linux-libre web site, and in the documentation we wrote for Guix
developers, that was sent to some of you not long ago.

Though it was announced sort of widely, since this move was directed
primarily at satisfying a Guix pain point, I figured I'd add it to
downloads only after making sure it did address Guix's needs, so that,
should it require significant changes, there wouldn't have to be much
concern about backward compatibility with the current status quo.

> One question: Would it solve the problem that I mentioned in my earlier
> email, namely the problem of how to determine which precise commit
> introduced a regression between two stable kernel releases?

No. There are much better (faster and less risky) ways to tend to that
requirement, see #bisecting below.

>>> It may be useful for users with newer hardware devices, which are
>>> not yet well supported by the latest stable release, to use an
>>> arbitrary commit from either Linus' mainline git repository or some
>>> other subsystem tree.
>>
>> The cleaning up scripts are version-specific and won't work on an
>> "arbitrary commit from Linus's mainline git repository" (i.e., someone
>> wanting to get today's most recent commit going into 5.9.) The scripts
>> would fall over and die in such a scenario,

> Okay, perhaps this was wishful thinking on my part.

Yup. If you ran a deblob-check in verify mode on the resulting
tarballs, you'd see how error-prone this is. You'd at least stop
non-Free code from silently sneaking in and finding its way into running
on users' machines. That's the *least* someone who runs the
deblob-scripts on their own should do to smoke-test the result WRT
*known* freedom issues.

> I had hoped that the deblob scripts would typically mostly work, even
> if they weren't able to do a comprehensive cleaning.

I'd honestly hope for a much higher standard than that for a
FSDG-compliant distro, especially one that carries the GNU mark.

> I would oppose adding such a partly-cleaned kernel to Guix itself,

But you don't! That's what you get when you jump the gun and use
outdated cleaning up scripts, without waiting for us to verify,
update and release them for a newer version.

> but I wanted to enable users who need to use some other branch of
> Linux on their own systems to make a best-effort cleaning.

Besides the likelihood of something going wrong, that seems like a
backwards goal for a distro that is not expected to as much as point
users at a non-Free package.

I'm sure that's not what you intend, but this arrangement, plus your
mention of hurriedly getting releases out, adds up to an incentive to
disable the deblobbing so as to get a faster build.

I hope you'll agree that this is undesirable.

As for how to speed up builds without sacrificing freedom, see below.

>>> It allows us to update to a new point version (which usually
>>> includes security fixes) more quickly, before the linux-libre
>>> project reacts.
>>
>> Any attempt to outrun the Linux-libre project and get updates out
>> sooner is unwise. While major new kernel releases will definitely
>> require updates to the cleanup scripts, even minor patched versions
>> occasionally require changes too. Updating to a new version prior to
>> the Linux-libre project having had time to review that new version and
>> determine if any updates are needed to the scripts risks introducing
>> freedom problems in the corresponding Guix version.

> In my experience, the deblob scripts are very rarely changed after the
> first few point releases of a stable release series.

My personal experience tells me otherwise. 5.7 had only one update at
.8; 5.6, at .6 and .16; 5.5, at .3, .11 and .19; 5.4, at .14, .18, .27,
.34 and .44; 5.3, at .4 and .11; 5.2 at .1, .3 and .11; 5.1 at .2, .18
and .20; 5.0 at .7 and .16. What you describe was true only of 4.17,
4.10, 4.3, 3.13, 3.5, and 3.2, i.e. 6 out of the 50 major releases
starting at 3.0.

> I know this because I always check for updates to the deblob scripts
> whenever I update linux-libre in Guix. In practice, the deblob scripts
> used by Guix are never more than 1 or 2 micro versions behind the
> version of Linux they are applied to.

There have been 61 script updates for the 1274 4.*.*-gnu* and 5.*.*-gnu*
stable releases, so Guix has shipped potentially non-FSDG code, that
*would* have been flagged by deblob-check on the tarballs, at between 5%
and 10% of these releases. Does that sound like a good standard for a
freedom-first distro to aim for?

>> The moment that the Linux-libre project determines that scripts are
>> suitable is the moment that the new cleaned-up release is ready to
>> publish in git and the appropriate tags will then appear in git.
>> The compressed tarballs come some time later.

> I prefer to avoid unnecessary delays when applying micro kernel updates,

Sorry, but it doesn't look like you do. If you did, you would be taking
a cleaned up tree instead of re-deblobbing it.

You skip even the automated verification we do, which saves you some
time, but at what price?

If you waited another 30 minutes for our cleaned-up and verified tree
to be available from git, you'd save yourself the 20 minutes of
cleaning-up and another 20 minutes of deblob-checking the tarball for
known or likely freedom issues. That sounds like a net win to me.

Now, if your build machines clean up and verify much faster than ours,
I'd be pretty glad to use them to get the verified commits in place so
that you could use them faster.

> I also consider it unwise for all of us, as a matter of habit or policy,
> to trust the integrity of the computer systems used by the Linux-libre
> project to perform the deblobbing.

I welcome double-checking of our cleaning up at all levels, but why are
you setting a higher trust standard for us than for a project known to
be at odds with our shared goals, such as Linux? You don't apply the
patches that went into it since the last known good release to
double-check their releases, do you?

For most projects, you just take their tarballs or tags and build them.
For Linux-libre, you start from (untrustworthy) Linux, run the (presumed
untrustworthy) cleaning up scripts, and blindly trust the result.
There's no self-verification run with deblob-check, no comparison with
our release, nothing. If you were to test the integrity of our
releases, you'd think you'd at least look at them.

Starting from a known-good Linux release and applying patches to
double-check the results is expensive, so it makes sense to do that only
occasionally, rather than as part of every build.

Deblobbing and checking the result is also expensive, so it also makes
sense for you to do so only occasionally, rather than as part of every
build.

But the point stands that, for someone who'd rather trust no one, you're
blindly trusting both Linux and Linux-libre. The former when it comes
to base releases you don't check; the latter when it comes to scripts
whose results you hardly even look at.

Why not reduce your trust base to just Linux-libre, and treat it as a
citizen of the same class as nearly every other project you build, and
satisfy your trust-but-verify needs looking into what changes between
one of our releases and another?

#bisecting

You can even take one of our releases and apply the patches that went
into the next upstream stable release, and check that what you get
matches our own corresponding release. Some 98% of the time, they will
be exact matches. Occasionally, there will be a difference, and then
you'll likely find a corresponding change in the deblobbing scripts, or
a preexisting pattern that caused the change. We do this for every
release, as part of our pre-release checks, and you're welcome to do so
as well, and to use the resulting tree to bisect problems. You'll see
the builds are much faster if you don't have to deblob every build.

Now, we've long had plans to publish a cleaned-up repo with Linux git
history. It would take a massive amount of work to get it started, but
after that, getting new releases out might be about as fast as running a
git merge. Ok, not that fast because there'd be some checking, but you
get the idea.

Other ideas to speed up our release process are:

- enabling cleaning up in multiple concurrent processes, like 'make -j'

- ditto for the deblob-check tarball verification

- use faster machines for the above

- monitor upstream git and fire up cleaning up automatically, and move
  the various manual signatures of commits, tarballs, and logs to the
  very end, after manual checking

These would enable the git commits to be out sooner. Currently, our
best case is to push a release to the git release archive about one hour
after the upstream release, but we can only get that realistically when
there's only one release to do. Most often, there are four to seven
releases at once, and then, since we don't always react immediately (one
needs to sleep occasionally ;-) and we don't get early warnings,
especially about major security issues, and our main workhorse has
limited capacity, we end up at about one hour per release anyway, having
them all ready at about the same time. (Compression of tarballs then
takes another half hour or more per release, but the releases are pushed
to the git release archive long before compression is completed)

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer

* Re: Linux-libre 5.8 and beyond
From: Mark H Weaver @ 2020-08-15 6:03 UTC
To: Alexandre Oliva; +Cc: guix-devel, Jason Self

Hi Alexandre,

Alexandre Oliva <lxoliva@fsfla.org> wrote:

> On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote:
>
>>>> It may be useful for users with newer hardware devices, which are
>>>> not yet well supported by the latest stable release, to use an
>>>> arbitrary commit from either Linus' mainline git repository or some
>>>> other subsystem tree.
>>>
>>> The cleaning up scripts are version-specific and won't work on an
>>> "arbitrary commit from Linus's mainline git repository" (i.e., someone
>>> wanting to get today's most recent commit going into 5.9.) The scripts
>>> would fall over and die in such a scenario,
>
>> Okay, perhaps this was wishful thinking on my part.
>
> Yup. If you ran a deblob-check in verify mode on the resulting
> tarballs, you'd see how error-prone this is. You'd at least stop
> non-Free code from silently sneaking in and finding its way into running
> on users' machines. That's the *least* someone who runs the
> deblob-scripts on their own should do to smoke-test the result WRT
> *known* freedom issues.

What is this "verify mode" that you're referring to, and where is it
documented? The word "verify" does not occur in either of the deblob
scripts that I know about, namely "deblob-<VERSION>" and "deblob-check".
The string "verif" occurs a few times, but nothing related to the script
functionality. I don't see anything like a verification mode mentioned
in the options documented at the top of those two scripts.

For the record, it was not my intent to skip any automated checking
provided by these scripts. If we're running the scripts in a suboptimal
way, please tell me a better way.

FYI, right now we're simply running the main 'deblob-<VERSION>' script
with no arguments in the unpacked Linux source directory, with the
corresponding 'deblob-check' script in $PATH and $PYTHON pointing to
python 2.x. If 'deblob-<VERSION>' exits abnormally or with a non-zero
result, the Guix build process fails.

Last I checked, 'deblob-check' was certainly being run by
'deblob-<VERSION>' as a subprocess, because I had to make several
substitutions of hard-coded paths before it would work in Guix
(e.g. /bin/sed and /usr/bin/python).

>> I had hoped that the deblob scripts would typically mostly work, even
>> if they weren't able to do a comprehensive cleaning.
>
> I'd honestly hope for a much higher standard than that for a
> FSDG-compliant distro, especially one that carries the GNU mark.

As I wrote below:

>> I would oppose adding such a partly-cleaned kernel to Guix itself,

With this in mind, your accusation above is not relevant to Guix.

Above, I was talking about my hope to enable users, *on their own
machines* and using *their own private build recipes*, to make a
best-effort deblobbing of a non-standard kernel variant that they need
to use for whatever reason. If they aren't provided with that option,
the obvious alternative (which I expect 99% of such users would do
anyway) is to simply run a fully-blobbed kernel instead.

> But you don't! That's what you get when you jump the gun and use
> outdated cleaning up scripts, without waiting for us to verify,
> update and release them for a newer version.

Here you are conflating two substantially different scenarios:

(1) Attempting to use your deblob scripts on a newer kernel that almost
    certainly includes many new drivers and blobs that aren't detected
    by your scripts. That's the case that I said I would oppose for
    inclusion in Guix.

(2) Using the deblob scripts made for 5.4.57 on a 5.4.58 kernel in
    order to apply security fixes more quickly, and where the
    probability of uncleaned new blobs is quite low.

>> but I wanted to enable users who need to use some other branch of
>> Linux on their own systems to make a best-effort cleaning.
>
> Besides the likelihood of something going wrong, that seems like a
> backwards goal for a distro that is not expected to as much as point
> users at a non-Free package.

It's *not* a goal for Guix, and it wasn't even my motivation for
teaching Guix to run the Linux-libre deblob scripts. It's just
something that, on a whim, I chose to include in my list of possible
advantages to having such functionality, nothing more.

> I'm sure that's not what you intend, but this arrangement, plus your
> mention of hurriedly getting releases out, adds up to an incentive to
> disable the deblobbing so as to get a faster build.

I don't understand how you reached this conclusion. As far as I can
tell, changing Guix to run the deblob scripts made *no* difference to
what someone would have to do to ask Guix to build a fully-blobbed
kernel.

> I hope you'll agree that this is undesirable.

Agreed.

>> In my experience, the deblob scripts are very rarely changed after the
>> first few point releases of a stable release series.
>
> My personal experience tells me otherwise. 5.7 had only one update at
> .8; 5.6, at .6 and .16; 5.5, at .3, .11 and .19; 5.4, at .14, .18, .27,
> .34 and .44; 5.3, at .4 and .11; 5.2 at .1, .3 and .11; 5.1 at .2, .18
> and .20; 5.0 at .7 and .16. What you describe was true only of 4.17,
> 4.10, 4.3, 3.13, 3.5, and 3.2, i.e. 6 out of the 50 major releases
> starting at 3.0.

I only checked your claims regarding 5.4, and found that you're mistaken
about them being updated in 5.4.44. In fact, the 'deblob-5.4' and
'deblob-check' files, as found in /pub/linux-libre/releases/, have not
changed since version 5.4.34.

Moreover, of the 4 deblob updates (.14, .18, .27, and .34) that have
*actually* been made so far during the 5.4.x series, IIUC only one of
them declared new blobs to remove, namely the update for 5.4.27.

The 5.4.14 update only removed extraneous backslashes in existing regexps, changing "\e" to "e" and "\@" to "@". I don't know whether these extraneous backslashes caused blobs to be included in the linux-libre tarballs, but if so, that presumably already happened in 5.4.13 and would have happened even if we had used your official tarballs, no? The 5.4.18 and 5.4.34 updates only added new 'accept' directives. I guess that means that temporarily omitting these additions wouldn't cause new blobs to be included, is that right? >> I know this because I always check for updates to the deblob scripts >> whenever I update linux-libre in Guix. In practice, the deblob scripts used by >> Guix are never more than 1 or 2 micro versions behind the version of >> Linux they are applied to. > > There have been 61 script updates for the 1274 4.*.*-gnu* and 5.*.*-gnu* > stable releases, so Guix has shipped potentially non-FSDG code, that > *would* have been flagged by deblob-check on the tarballs, at between 5% > and 10% of these releases. Does that sound like a good standard for a > freedom-first distro to aim for? If it were true that we've been including blobs in 5-10% of our linux-libre releases, I agree that would be a serious problem. However, I believe your estimates are way off, so I took a closer look at the statistics for the 5.4, 4.19, and 4.14 kernels. I already wrote about 5.4 above. If we include only the deblob updates that added checks for new blobs, it's only happened once in 58 upstream updates, i.e. for 1.7% of the updates. In the 4.19 series, although the deblob scripts have been updated 8 times, of those 8, 3 only add 'accept' directives, and a fourth only makes the same regexp fixes mentioned above ("\e" -> "e" and "\@" -> "@"). In other words, only 4 of these deblob updates might result in new blobs being recognized. So that's 4 new blob updates out of 139 upstream updates, which comes out to 2.9%. 
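The percentages above follow from simple division. A minimal sketch of that arithmetic, using only the counts quoted in this message (the function names are illustrative, and nothing here is measured independently):

```python
def blob_relevant(total_script_updates, accept_only, regexp_only):
    """Script updates that could cause *new* blobs to be recognized."""
    return total_script_updates - accept_only - regexp_only


def rate(updates, upstream_updates):
    """Share of upstream point releases affected, as a percentage."""
    return 100.0 * updates / upstream_updates


# 5.4: 4 script updates, 2 accept-only (.18, .34), 1 regexp-only (.14), 58 upstream updates.
# 4.19: 8 script updates, 3 accept-only, 1 regexp-only, 139 upstream updates.
series = {
    "5.4": (blob_relevant(4, 2, 1), 58),
    "4.19": (blob_relevant(8, 3, 1), 139),
}
for name, (updates, upstream) in series.items():
    print(f"{name}: {updates}/{upstream} = {rate(updates, upstream):.1f}%")

# For comparison, counting every script update across all releases (61 of 1274):
print(f"all script updates: 61/1274 = {rate(61, 1274):.1f}%")
```

Counting every script update gives a little under 5%, while counting only the updates that could flag new blobs gives the lower figures argued for here.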
In the 4.14 series, the deblob scripts were updated 6 times, but 3 only
add 'accept' directives and a fourth only makes the regexp fix. So that
comes out to 2 new blob updates out of 193 upstream updates, i.e. 1.0%.

So, unless I'm missing something, it's more accurate to say that when I
push a Linux-libre security update before waiting for you to bless it,
I'm taking a 1-3% risk that a blob might end up in the result.

I find that level of risk undesirable. I would certainly rather avoid
it. I guess where you and I differ is that I *also* find it undesirable
to subject our users to unnecessary delays in getting these security
updates, because that *also* carries a risk, namely the risk that their
systems will be compromised due to a delayed security update.

To my mind, it makes sense to balance these two risks, especially since
we know that it's simply impractical to completely eliminate the risk of
non-FSDG-compliant code occasionally finding its way into Guix.

>>> The moment that the Linux-libre project determines that scripts are
>>> suitable is the moment that the new cleaned-up release is ready to
>>> publish in git and the appropriate tags will then appear in git. The
>>> compressed tarballs come some time later.
>
>> I prefer to avoid unnecessary delays when applying micro kernel updates,
>
> Sorry, but it doesn't look like you do. If you did, you would be taking
> a cleaned up tree instead of re-deblobbing it.

I'm not concerned about another 30 minutes (or whatever) to run the
deblob scripts, especially if the alternative is to trust the integrity
of your machines unnecessarily. The delays I'd prefer to avoid are ones
measured in tens of hours, which is occasionally how long it takes
before Linux-libre reacts to a new upstream update.

> You skip even the automated verification we do, which saves you some
> time, but at what price?

As I wrote above, if there's some automated verification that we are
failing to do, please tell me how to do it.
It was certainly not my intent to skip any such verification. >> I also consider it unwise for all of us, as a matter of habit or policy, >> to trust the integrity of the computer systems used by the Linux-libre >> project to perform the deblobbing. > > I welcome double-checking of our cleaning up at all levels, but why are > you setting a higher trust standard for us than for a project known to > be at odds with our shared goals, such as Linux? I don't understand how you reached the conclusion that I'm setting a higher trust standard for Linux-libre than for Linux. The principle I'm following here is simply to avoid relying on the integrity of any system if I can easily avoid it. In particular, if I can easily run an automated process on my own machine instead of relying on some other system to provide pre-generated outputs for me, then I prefer to do it myself. > You don't apply the patches that went into it since the last known > good release to double-check their releases, do you? For most > projects, you just take their tarballs or tags and build it. That's true, and I agree that it's something we could improve on. It would be preferable to fetch from a git repository instead, and preferably one that has a lot of eyes on it. > For Linux-libre, you start from (untrustworthy) Linux, run the > (presumed untrustworthy) cleaning up scripts, and blindly trust the > result. I agree that we cannot avoid trusting many people and systems, and that in most cases that trust is blind. In this case, we cannot avoid trusting the Linux source code (even if we download exclusively from the Linux-libre project), and we cannot avoid trusting the Linux-libre deblob scripts. However, I reject the argument that because we must trust X and Y, we might as well trust Z as well. > There's no self-verification run with deblob-check, Again, if we're failing to do that, it's a bug that has not previously been brought to my attention. See above. > no compare with our release, nothing. 
If you were to test the > integrity of our releases, you'd think you'd at least look at them. I *did* compare with your releases when I first taught Guix how to run the deblob scripts, but not since then. Anyway, I fail to see the relevance of this fact. I agree that it would be useful for someone running Guix to compare our generated tarballs to yours. There are millions of useful things I *could* do with my time, but alas, my energies are limited. > If you were to test the integrity of our releases, you'd think you'd at > least look at them. Starting from a known-good Linux release and > applying patches to double-check the results is expensive, so it makes > sense to do that only occasionally, rather than as part of every build. > Deblobbing and checking the result is also expensive, so it also makes > sense for you to do so only occasionally, rather than as part of every > build. As far as I can tell, the vast majority of Guix users use substitutes provided by its build farm. I guess that it's fairly rare for people to build everything on their own machines, as I do. > But the point stands that, for someone who'd rather trust no one, you're > blindly trusting both Linux and Linux-libre. The former when it comes > to base releases you don't check; the latter when it comes to scripts > whose results you hardly even look at. Why not reduce your trust base > to just Linux-libre, That's not possible. Clearly, you do not have the capacity to audit all of the code that Linux produces. Therefore, by trusting Linux-libre, we must implicitly also trust the Linux project. That much we cannot avoid. We also cannot avoid trusting your deblob scripts. However, we *can* easily avoid trusting the integrity of the systems that you use to run the deblob scripts. > and treat is as a citizen of the same class as > nearly every other project you build, and satisfy your trust-but-verify > needs looking into what changes between one of our releases and another? 
You seem to be suggesting that I'm treating Linux-libre with less
respect than other projects in Guix. I reject that claim. In fact, I
strongly support reducing Guix's reliance on pre-generated outputs
produced by *any* project. I'm not singling out the Linux-libre project
here.

For example, one of the things I've recently been thinking about is that
Guix currently trusts the integrity of all the scripts generated by
autoconf/automake/libtool/etc for most of the tarballs that we download.
Those scripts are generated on random developer machines, and they are
very difficult to reproduce, because they depend on the precise versions
of many other packages, and Debian also seems to have extensively
modified their automake, leading to other differences. I would be in
favor of working toward generating those scripts in Guix itself where
possible, but it's a big job and likely to cause maintenance headaches.

For another example, I also taught Guix how to generate the IceCat
source tarball from the corresponding Firefox tarball, and I intend to
keep it that way, although I'm currently an IceCat maintainer.

>> One question: Would it solve the problem that I mentioned in my earlier
>> email, namely the problem of how to determine which precise commit
>> introduced a regression between two stable kernel releases?
>
> No. There are much better (faster and less risky) ways to tend to that
> requirement, see #bisecting below.

[...]

> #bisecting
>
> You can even take one of our releases and apply the patches that went
> into the next upstream stable release, and check that what you get
> matches our own corresponding release. Some 98% of the time, they will
> be exact matches. Occasionally, there will be a difference, and then
> you'll likely find a corresponding change in the deblobbing scripts, or
> a preexisting pattern that caused the change.
> We do this for every release, as part of our pre-release checks, and
> you're welcome to do so as well, and to use the resulting tree to
> bisect problems.

I agree that this would be faster, but I fail to see how it's "less
risky" than running the deblob scripts meant for Linux-libre X.Y.Z on a
git checkout of the upstream stable git repo between X.Y.(Z-1) and
X.Y.Z.

More importantly, it's a much less straightforward thing to implement.
In the current implementation, we get the ability to deblob arbitrary
git commits from the same stable branch essentially for free.

I guess you're suggesting that I should implement a radically different
mechanism specifically for this purpose, one that extracts the
individual patches from the upstream stable git repository, attempts to
apply them to the base Linux-libre release, and compares the result to
the next Linux-libre release, and that I should then implement my own
bisection functionality.

If I were to implement this, what would you suggest I do if the patches
fail to apply, or if the result fails to match the next Linux-libre
release?

     Thanks,
       Mark
* Re: Linux-libre 5.8 and beyond
From: Mark H Weaver @ 2020-08-16 1:24 UTC
To: Alexandre Oliva; +Cc: guix-devel, Jason Self

Hi Alexandre,

I thought about it some more, and I've changed my mind on one point:
I've decided that for future kernel updates, in order to eliminate the
risk of unintentionally allowing blobs into Guix, I will either wait for
Linux-libre to publish updated deblob scripts, or else I will manually
check for new blobs.

     Thanks,
       Mark
* Re: Linux-libre 5.8 and beyond 2020-08-16 1:24 ` Mark H Weaver @ 2020-08-16 12:43 ` Jason Self 0 siblings, 0 replies; 17+ messages in thread From: Jason Self @ 2020-08-16 12:43 UTC (permalink / raw) To: Mark H Weaver; +Cc: guix-devel [-- Attachment #1: Type: text/plain, Size: 2311 bytes --] On Sat, 15 Aug 2020 21:24:08 -0400 Mark H Weaver <mhw@netris.org> wrote: > Hi Alexandre, > > I thought about it some more, and I've changed my mind on one point: > I've decided that for future kernel updates, in order to eliminate the > risk of unintentionally allowing blobs into Guix, I will either wait > for Linux-libre to publish updated deblob scripts, or else I will > manually check for new blobs. This can be determined by checking for the availability of the new kernel version in git. The git repository is updated first, prior to tarballs being created so I assume you'd want to be looking there given that speed of updates seems important. If the new kernel version appears without a corresponding script update then you can know that no script updates were determined to be necessary. Wouldn't a better setup be to obtain the desired kernel version from Linux-libre, obtain the desired kernel version from kernel.org, independently run the clean-up scripts, and then toss out the results from kernel.org once the source code is determined to be identical?* I mean, if you're already willing to wait until the analysis of whether updated cleanup scripts are needed or not has been done, then you're already at the point of the Linux-libre kernel source code being available too because once that determination is made, any updated scripts and the corresponding kernel source code are pushed into git simultaneously. Confirming if the results you get from the cleanup scripts are the same is helpful all around. 
It is not necessary to trust the Linux-libre project infrastructure,
because you're also verifying its integrity. Doing it this way also gets
you access to the double verification steps that are done, which check
that the version does in fact correspond to the upstream version plus
the changes that Linux-libre made, and that it also corresponds to the
previous release plus the incremental patches.

* As a disclaimer, there may be one difference, in that the clean-up
scripts will in some cases delete all of the files in a directory while
leaving the directory itself in place. Git doesn't track empty
directories, and so diffing the entire kernel source code would reveal
that. The diff should otherwise report everything to be identical.
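The comparison suggested in the message above (deblob the kernel.org source locally, then check it against the Linux-libre tree) could be sketched roughly as follows. This is only an illustration under stated assumptions: both trees are already unpacked or checked out locally, and the function names are hypothetical rather than part of any Linux-libre or Guix tooling. Because only files and symlinks are recorded, directories left empty by the clean-up scripts do not count as differences, which matches the disclaimer above.

```python
import hashlib
import os


def file_digests(root):
    """Map each file or symlink under root (by relative path) to a digest."""
    digests = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git metadata when one side is a checkout rather than a tarball.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                # Record where the link points instead of following it.
                digests[rel] = "symlink:" + os.readlink(path)
            elif os.path.isfile(path):
                with open(path, "rb") as handle:
                    digests[rel] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def tree_differences(local_tree, release_tree):
    """Return the sorted relative paths that are missing or differ."""
    ours = file_digests(local_tree)
    theirs = file_digests(release_tree)
    return sorted(p for p in ours.keys() | theirs.keys()
                  if ours.get(p) != theirs.get(p))
```

An empty result from `tree_differences` would indicate that the locally deblobbed tree and the published one agree file for file; any listed path is a place to look more closely.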
* Re: Linux-libre 5.8 and beyond
From: Jason Self @ 2020-08-16 10:54 UTC
To: Mark H Weaver; +Cc: guix-devel

I always thought the reproducible builds mantra was "trust but verify",
not to actively distrust?
* Re: Linux-libre 5.8 and beyond 2020-08-15 6:03 ` Mark H Weaver 2020-08-16 1:24 ` Mark H Weaver 2020-08-16 10:54 ` Jason Self @ 2020-08-24 3:45 ` Alexandre Oliva 2020-08-25 4:14 ` Mark H Weaver 2020-08-24 3:58 ` Alexandre Oliva ` (3 subsequent siblings) 6 siblings, 1 reply; 17+ messages in thread From: Alexandre Oliva @ 2020-08-24 3:45 UTC (permalink / raw) To: Mark H Weaver; +Cc: guix-devel, Jason Self Hello, Mark, Apologies for the delay in responding. It's been an "interesting" week. I'm breaking up what turned out to be a very very long reply into multiple posts, so as to address the various issues in separate posts, that might very well turn into separate subthreads. On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote: > Alexandre Oliva <lxoliva@fsfla.org> wrote: >> No. There are much better (faster and less risky) ways to tend to that >> requirement, see #bisecting below. > [...] >> #bisecting >> >> You can even take one of our releases and apply the patches that went >> into the next upstream stable release, and check that what you get >> matches our own corresponding release. Some 98% of the time, they will >> be exact matches. Occasionally, there will be a difference, and then >> you'll likely find a corresponding change in the deblobbing scripts, or >> a preexisting pattern that caused the change. We do this for every >> release, as part of our pre-release checks, and you're welcome to do so >> as well, and to use the resulting tree to bisect problems. > I agree that this would be faster, but I fail to see how it's "less > risky" than running the deblob scripts meant for Linux-libre X.Y.Z on a > git checkout of the upstream stable git repo between X.Y.(Z-1) and > X.Y.Z. You're right. My mistake was failing to mention the need to compare between X.Y.Z-gnu* and the result of rebasing the X.Y.(Z-1)..X.Y.Z patches onto X.Y.(Z-1)-gnu* to enjoy the safety I had in mind. 
If the changes made to both ends are the same, then it's not entirely unreasonable to assume that all intermediate commits would also be properly cleaned up with the same set of changes, assuming the history in between is linear (i.e., only cherry-picks, not merges that could take a bisect to much earlier commits). Even with linear history, you might admittedly still be surprised if an intervening patch introduces something undesirable and a subsequent one reverses it. Running through deblob-check each version of each modified file that matches neither boundary would mechanically avoid nearly all such surprises. > More importantly, it's a much less straightforward thing to implement. It really isn't. Once you remove the deblobbing and point the recipe directly at the linux-libre git repo, users will be able to do the above rebasing on a local repo, and build reasonably quickly any of the commits that the local git bisect tells them to try. > In the current implementation, we get the ability to deblob arbitrary > git commits from the same stable branch essentially for free. Taking arbitrary commits from a known non-Free repo is really not something to be encouraged, given the odds of hitting freedom problems. When using the latest scripts for a stable series, odds are the procedure you suggest would work more or less reliably within that stable series, but we've recently had an example in the 5.7-gnu series in which it wouldn't. > I guess you're suggesting that I should implement a radically different > mechanism specifically for this purpose, that extracts the individual > patches from the upstream stable git repository, attempt to apply them > to the base Linux-libre release, compare that to the next Linux-libre > release, and then implement my own bisection functionality. 
  git rebase --onto libre/vX.Y.(Z-1)-gnu nonfree/vX.Y.(Z-1) nonfree/vX.Y.Z
  git diff libre/vX.Y.Z-gnu
  git bisect start HEAD libre/vX.Y.(Z-1)-gnu

> If I were to implement this, what would you suggest I do if the patches
> fail to apply

Look at the conflict presented by the rebase, and resolve the likely
freedom issue introduced at that point.

> if the result fails to match the next Linux-libre release?

Identify the intervening commit where code that got cleaned up
differently was introduced or removed and make the change to the
deblobbing at that point; rinse and repeat.

If differences remain that are not caused by the patches, it's something
that changed in the scripts, possibly improving or correcting an earlier
deblobbing error, e.g., something cleaned up that was found to be Free,
or something that was missed in deblobbing, or a different way to clean
it up. Such differences will likely be noticeable in the scripts. It's
probably best to turn the difference in cleaning up into a separate
commit for the purposes of the bisection, just in case it is the source
of the issue being investigated.

-- 
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
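For a concrete pair of releases, the three git commands given in the message above can be spelled out mechanically. The sketch below only builds the command lines and does not run them. It assumes the refs are named exactly as written in the message (`libre/vX.Y.Z-gnu` and `nonfree/vX.Y.Z`), which depends on how the local repository's remotes and tags are set up, and the helper name is hypothetical.

```python
import shlex


def rebase_and_bisect_commands(series, z):
    """Build the rebase, diff and bisect commands for release series.z.

    Assumes z >= 2; the first point release of a series sits on a base
    tag that is named differently and is not handled here.
    """
    if z < 2:
        raise ValueError("this sketch only handles point releases z >= 2")
    prev = f"v{series}.{z - 1}"
    new = f"v{series}.{z}"
    return [
        # Replay the upstream patches on top of the previous cleaned-up release.
        ["git", "rebase", "--onto", f"libre/{prev}-gnu",
         f"nonfree/{prev}", f"nonfree/{new}"],
        # Compare the rebased result with the published cleaned-up release.
        ["git", "diff", f"libre/{new}-gnu"],
        # Bisect between the rebased tip (bad) and the previous release (good).
        ["git", "bisect", "start", "HEAD", f"libre/{prev}-gnu"],
    ]


for command in rebase_and_bisect_commands("5.4", 34):
    print(shlex.join(command))
```

Building the argument lists separately from running them keeps the sketch side-effect free; a rebase conflict or a non-empty diff would still need the manual attention described in the message.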
* Re: Linux-libre 5.8 and beyond 2020-08-24 3:45 ` Alexandre Oliva @ 2020-08-25 4:14 ` Mark H Weaver 2020-08-25 11:12 ` Alexandre Oliva 0 siblings, 1 reply; 17+ messages in thread From: Mark H Weaver @ 2020-08-25 4:14 UTC (permalink / raw) To: Alexandre Oliva; +Cc: guix-devel, Jason Self Hi Alexandre, Alexandre Oliva <lxoliva@fsfla.org> wrote: > On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote: > >> If I were to implement this, what would you suggest I do if the patches >> fail to apply > > Look at the conflict presented by the rebase, and resolve the likely > freedom issue introduced at that point. > >> if the result fails to match the next Linux-libre release? > > Identify the intervening commit where code that got cleaned up > differently was introduced or removed and make the change to the > deblobbing at that point; rinse and repeat. In other words, your proposed approach cannot be done automatically in the general case. Do you see how this is a problem? If a Guix user reports that one of their devices stopped working in Linux-libre-5.4.34, I'd like to enable them to easily build deblobbed kernels at intermediate commits on the upstream stable/linux-5.4.y branch. With the present approach, I can provide a simple Guix recipe to do this automatically. With your proposed approach, the user may need to manually resolve merge conflicts and so on. Not all Guix users will have the skills or motivation to do this. Even _I_ would not want to do this, because it would mean doing unnecessary labor. I would much rather have my computer do this job while I do something else, even if it takes longer. Mark ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Linux-libre 5.8 and beyond 2020-08-25 4:14 ` Mark H Weaver @ 2020-08-25 11:12 ` Alexandre Oliva 0 siblings, 0 replies; 17+ messages in thread From: Alexandre Oliva @ 2020-08-25 11:12 UTC (permalink / raw) To: Mark H Weaver; +Cc: guix-devel, Jason Self Hello, Mark, On Aug 25, 2020, Mark H Weaver <mhw@netris.org> wrote: > Alexandre Oliva <lxoliva@fsfla.org> wrote: >> On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote: >> >>> If I were to implement this, what would you suggest I do if the patches >>> fail to apply >> >> Look at the conflict presented by the rebase, and resolve the likely >> freedom issue introduced at that point. >> >>> if the result fails to match the next Linux-libre release? >> >> Identify the intervening commit where code that got cleaned up >> differently was introduced or removed and make the change to the >> deblobbing at that point; rinse and repeat. > In other words, your proposed approach cannot be done automatically in > the general case. Do you see how this is a problem? Remember a few emails ago when *you* argued that the changes to the deblobbing scripts were so infrequent that it would do no harm to use the scripts for an earlier release, without waiting for us to check that they work for a newer one? How come now the very same circumstances have become so frequent as to be a problem? You see, the cases in which there would be patch conflicts and need for manual resolution are those in which the cleanups made by older scripts are no longer enough to clean up subsequent trees. Most of these are cases in which manual intervention is required to adjust the scripts. But you wish to use the scripts to clean up intervening commits that it was never tested to work on and that it may actually fail on, leaving non-FSDG bits in place, *instead* of using a procedure that will reliably tell you about the IYO rare cases in which manual intervention is required. 
You dismiss our automated and manual verifications, you used to use outdated scripts, but you can't be bothered to run a simple procedure to check that there aren't freedom issues introduced in an upstream stable release, and to make the required manual adjustments to keep the builds FSDG-compliant? Heck, I'll be glad to publish, upon request, in the Linux-libre git repo, a verified-FSDG incremental stable release branch, i.e., a branch starting from one release and ending at a tree identical to that of a subsequent release in the same stable branch. Then you can point users at that branch for bisecting within that range. Should that be as seamless as I expect it to be, I might even start doing that regularly, for all stable releases, as an incremental step towards the git repo with the cleaned-up commit history of Linux development. > If a Guix user reports that one of their devices stopped working in > Linux-libre-5.4.34, I'd like to enable them to easily build deblobbed > kernels at intermediate commits on the upstream stable/linux-5.4.y > branch. That amounts to referring users to a non-Free source repo; that's not desirable per the FSDG. If we want to enable them to bisect for us, we should offer them our own cleaned-up history. > With the present approach, I can provide a simple Guix recipe > to do this automatically. ... and quite prone to misuse due to unreasonable expectations that our scripts can't live up to. > With your proposed approach, the user may > need to manually resolve merge conflicts and so on. Not all Guix users > will have the skills or motivation to do this. *We*, not the users, should get to prepare the cleaned-up repo and fix the conflicts, if we wish them to bisect for us. > Even _I_ would not want to do this, because it would mean doing > unnecessary labor. It would be unnecessary if we had magic scripts that worked reliably no matter what you threw at them. 
This is not the case, and your expectation that it is makes you perceive that as unnecessary labor. That we do only part of that labor, checking only actual releases rather than all intervening commits, might have been enough of a hint that it takes actual labor to get what you expect to get with zero effort. > I would much rather have my computer do this job > while I do something else, even if it takes longer. It can do much of it, but not all of it as you expect. You're unfortunately objecting to the manual labor in the very cases in which your computer is unable to do it :-/ -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Linux-libre 5.8 and beyond 2020-08-15 6:03 ` Mark H Weaver ` (2 preceding siblings ...) 2020-08-24 3:45 ` Alexandre Oliva @ 2020-08-24 3:58 ` Alexandre Oliva 2020-08-24 4:12 ` Alexandre Oliva ` (2 subsequent siblings) 6 siblings, 0 replies; 17+ messages in thread From: Alexandre Oliva @ 2020-08-24 3:58 UTC (permalink / raw) To: Mark H Weaver; +Cc: guix-devel, Jason Self On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote: > Alexandre Oliva <lxoliva@fsfla.org> wrote: >> On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote: >> >>>>> It may be useful for users with newer hardware devices, which are >>>>> not yet well supported by the latest stable release, to use an >>>>> arbitrary commit from either Linus' mainline git repository or some >>>>> other subsystem tree. >>>> >>>> The cleaning up scripts are version-specific and won't work on an >>>> "arbitrary commit from Linus's mainline git repository" (i.e., someone >>>> wanting to get today's most recent commit going into 5.9.) The scripts >>>> would fall over and die in such a scenario, >> >>> Okay, perhaps this was wishful thinking on my part. >> >> Yup. If you ran a deblob-check in verify mode on the resulting >> tarballs, you'd see how error-prone this is. You'd at least stop >> non-Free code from silently sneaking in and finding its way into running >> on users' machines. That's the *least* someone who runs the >> deblob-scripts on their own should do to smoke-test the result WRT >> *known* freedom issues. > What is this "verify mode" that you're referring to, and where is it > documented? I'm talking about the --list-blobs (default) option of deblob-check, that tests whether an input file (source file, patch file, or tarball) contains any suspicious patterns. Running deblob-check with --help prints a significant amount of documentation, though it is mostly aimed at the internal purposes that the scripts serve. 
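A build recipe could invoke that check on its own output along these lines. This is a minimal sketch, not the way Guix or Linux-libre actually do it: it assumes `deblob-check` is on `$PATH`, uses only the `--list-blobs` option named above, and deliberately makes no assumption about what the exit status means, leaving the caller to inspect both the return code and the captured output. The function name and the `checker` parameter are hypothetical.

```python
import subprocess


def list_suspicious(path, checker=("deblob-check", "--list-blobs")):
    """Run the checker on a source file, patch file, or tarball.

    Returns the CompletedProcess so the caller can examine returncode,
    stdout and stderr; nothing is raised on a non-zero exit status.
    """
    return subprocess.run([*checker, path],
                          capture_output=True, text=True, check=False)
```

Passing the checker as a parameter makes it possible to substitute a patched copy of the script, which matters in an environment where hard-coded paths such as /bin/sed had to be replaced.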
The cleaning up scripts are not really meant to be blindly used by third
parties to clean up anything but releases they're associated with;
they're provided for documentation and transparency purposes, but
they're not even something whose existence you should count on. E.g.,
once we realize the long-term vision of having a git repo with the
entire history, manually cleaned-up, there won't be a script to clean
things up any more, though there will surely still be something to help
us identify anything that needs cleaning up.

> The word "verify" does not occur in either of the deblob
> scripts that I know about

That's what the 'check' in deblob-check stands for. Originally, it would
only scan for blobs. Later on, it was extended with other actions for
use in cleaning up.

> I don't see anything like a verification mode mentioned
> in the options documented at the top of those two scripts.

Indeed, deblob-<VERSION>, which is what you use, and deblob-check, as
used by it, do not perform any verification whatsoever. They're not
meant to. They automate and document what we intend to clean up. The
verifications are steps we take once we have a candidate release, that
tell us whether or not it's fit for release. If it isn't, we adjust the
scripts and start over.

You'd have to run

  deblob-check linux-libre-<VERSION>-guix.tar

on the cleaned up tarball to check that none of the suspicious patterns
known by deblob-check have survived in the resulting tarball. It would
have caught the errors that Vagrant hit the other day, and it would have
reported the deblobbing errors you'd have got this week had you not
waited for the updated scripts.

Running this script in -B or -C modes is part of our development process
for new releases, and it is also one of our safety nets to stop us from
releasing non-Free Software: we run it for every release before putting
it out.

> For the record, it was not my intent to skip any automated checking
> provided by these scripts.
I understand it was not your intent, but when the scripts are used in
environments they weren't tested in, with upstream releases or commits
they weren't meant for, the expectation that they will do the job you
wish without any of the verification steps we perform is misplaced.

> If we're running the scripts in a suboptimal
> way, please tell me a better way.

> FYI, right now we're simply running the main 'deblob-<VERSION>' script
> with no arguments in the unpacked Linux source directory, with the
> corresponding 'deblob-check' script in $PATH and $PYTHON pointing to
> python 2.x. If 'deblob-<VERSION>' exits abnormally or with a non-zero
> result, the Guix build process fails.

> Last I checked, 'deblob-check' was certainly being run by
> 'deblob-<VERSION>' as a subprocess, because I had to make several
> substitutions of hard-coded paths before it would work in Guix
> (e.g. /bin/sed and /usr/bin/python).

The expected use of the scripts, for people who wish to verify that our
releases have been cleaned up as specified in the scripts, is to do just
what you do, and then compare the resulting source tree with that of our
release. If they match, you know we haven't sneaked in any unintended
changes. If they don't, something went wrong on either end.

Given our amount of experience and automation in the release and
verification processes, which scan the resulting source tree and also
compare the changes with those made by an earlier known good recent
release, a platform-specific bug in the underlying tools, an unexpected
change to regexp engines (as in some recent version of python3), or the
use of mismatched scripts are more likely sources of differences than
our failing to notice an unexpected change on our end.
Now, if you wanted to use the scripts for purposes other than verification, e.g., to clean up releases before we check them, or even after we put them out but without any attempt to verify that the result you get is indeed what we put out, you should take responsibility for verifying the releases at least as much as we do, otherwise any freedom issues arising from your not catching a problem we would have caught would unfairly reflect negatively on our project. -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer ^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: Linux-libre 5.8 and beyond
From: Alexandre Oliva @ 2020-08-24 4:12 UTC
To: Mark H Weaver; +Cc: guix-devel, Jason Self

On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:

> I only checked your claims regarding 5.4, and found that you're mistaken
> about them being updated in 5.4.44.

There was a change to scripts at 5.4.44, just not one you cared about,
because you didn't use the (discontinued) deblob-main script to prepare
a cleaned-up source tarball.

> Moreover, of the 4 deblob updates (.14, .18, .27, and .34) that have
> *actually* been made so far during the 5.4.x series, IIUC only one of
> them declared new blobs to remove, namely the update for 5.4.27.

That's missing the point. Nearly all of these changes were motivated by
changes reported as suspicious in our verification. Some turned out to
be false positives, but they might as well have been new blobs. Any
change has a potential to introduce new blobs, and our verification
catches suspicious changes that you'd have quietly published as Free
Software. That is the risk you're passing on to your users, instead of
living up to the expectation that you're doing your best to ensure
they're not getting any non-Free Software from you. The value we
provide, of checking that for every release, you're throwing on the
floor.

Yes, the releases that would *actually* introduce undesirable changes,
vs merely suspicious ones that turn out to be false positives, are a
smaller fraction of the total. But what you're doing right now is
driving with blinders on because then you can go faster, because history
has shown there's only a 5% or 2% chance of hitting a bus.
> The 5.4.14 update only removed extraneous backslashes in existing
> regexps, changing "\e" to "e" and "\@" to "@".

That was in response to a change in the python3.7 (?) regexp engine.
Fortunately, all one got from the extraneous backslashes were warnings.
But it could have been an actual change in output, or a failure to match
a pattern that ought to have been cleaned up, and since you don't
compare with our releases, you could have got non-Free results as much
as from a newly-introduced bit from upstream.

> I don't know whether these extraneous backslashes caused blobs to be
> included in the linux-libre tarballs, but if so, that presumably
> already happened in 5.4.13 and would have happened even if we had used
> your official tarballs, no?

No. If we'd hit it ourselves, our release engineering procedures would
have caught the unexpected change.

That's why treating our scripts (rather than our releases) as the
ultimate truth is error prone: the underlying tools are complex and
subject to change and bugs. If you don't verify that their output isn't
garbage (by comparing with our manually verified releases, or by
performing equivalent automated and manual checks), you may end up
shipping that garbage. Odds are that you already have.

> The 5.4.18 and 5.4.34 updates only added new 'accept' directives. I
> guess that means that temporarily omitting these additions wouldn't
> cause new blobs to be included, is that right?

You're probably right for these instances, but it does not necessarily
follow that script changes that only add 'accept' patterns wouldn't get
you in trouble without them.

At times, we've had to add accept statements to match newly-added
occurrences of '.firmware' in such constructs as:

    struct foo var = {
      .whatever = value,
      .firmware = "filename",
      ...
    };

These initializers are regarded as suspicious, so they need to be
manually marked as accepted, whether or not the filename turns out to be
a blob name that we clean up.
Without arranging for a newly-introduced '.firmware' initializer to be
accepted, this may end up cleaned up into:

    struct foo var = {
      .whatever = value,
      /*(DEBLOBBED)*/ "/*(DEBLOBBED)*/",
      ...
    };

which will get you a successful cleaning up session (say, if the
firmware name was already known, in a file that we already cleaned up),
and even a successful compilation, but, depending on the order of the
fields in struct foo, the cleaned-up firmware name may end up used to
initialize the wrong field.

>>> I know this because I always check for updates to the deblob scripts
>>> whenever I update linux-libre in Guix. In practice, the deblob scripts used by
>>> Guix are never more than 1 or 2 micro versions behind the version of
>>> Linux they are applied to.
>>
>> There have been 61 script updates for the 1274 4.*.*-gnu* and 5.*.*-gnu*
>> stable releases, so Guix has shipped potentially non-FSDG code, that
>> *would* have been flagged by deblob-check on the tarballs, at between 5%
>> and 10% of these releases. Does that sound like a good standard for a
>> freedom-first distro to aim for?

> If it were true that we've been including blobs in 5-10% of our
> linux-libre releases, I agree that would be a serious problem.

Not what I meant, FWIW. What I meant was that in 5-10% of the times you
might have *known* you had something wrong in your cleaned up tree if
you'd just run deblob-check on it for one of the automated
verifications.

> I already wrote about 5.4 above. If we include only the deblob updates
> that added checks for new blobs, it's only happened once in 58 upstream
> updates, i.e. for 1.7% of the updates.

The statistic you're using, counting only the suspicious changes that
were not false positives, is analogous to saying that jaywalking, or
driving across a red light, without even looking, are acceptable as long
as you don't get hit or caught.
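For reference, the percentages quoted in this exchange can be reproduced with simple arithmetic. The sketch below is one reading of the figures, not a calculation taken from either author: the helper `affected_fraction` is hypothetical, and it assumes that each script update leaves up to `max_lag` releases cleaned with outdated scripts, with no overlap between those windows.

```python
def affected_fraction(script_updates, releases, max_lag=1):
    """Upper bound on the share of releases cleaned with outdated scripts.

    Assumption (not from the thread): each script update can leave up to
    `max_lag` consecutive releases on the previous scripts, and those
    windows never overlap.
    """
    return min(1.0, script_updates * max_lag / releases)


# 61 script updates over 1274 stable releases, recipe lagging by 1 release
print(f"{affected_fraction(61, 1274, 1):.1%}")  # 4.8%
# Same counts, recipe lagging by up to 2 releases
print(f"{affected_fraction(61, 1274, 2):.1%}")  # 9.6%
# One update adding new blob checks in 58 upstream 5.4.x updates
print(f"{affected_fraction(1, 58, 1):.1%}")     # 1.7%
```

The first two results correspond to the "between 5% and 10%" range, and the third to the 1.7% figure. The two estimates count different things: all script updates in one case, only updates that declared new blobs in the other.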
Getting lucky 90%, 95% or even 98% of the time doesn't make up for
disregarding the procedures that would have warned you of avoidable
issues, whether or not they turn out to be actual freedom issues.

The other reason you got much lower results than me was that I made room
for your recipe's lagging for up to 2 releases (thus the 5% of stable
releases requiring deblobbing changes turns into 10%), as you'd said,
while you seem to have done yours assuming they'd lag for at most 1.

-- 
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
* Re: Linux-libre 5.8 and beyond
From: Alexandre Oliva @ 2020-08-24 4:34 UTC
To: Mark H Weaver; +Cc: guix-devel, Jason Self

On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:

> Alexandre Oliva <lxoliva@fsfla.org> wrote:
>> On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote:
>>> I also consider it unwise for all of us, as a matter of habit or policy,
>>> to trust the integrity of the computer systems used by the Linux-libre
>>> project to perform the deblobbing.
>>
>> I welcome double-checking of our cleaning up at all levels, but why are
>> you setting a higher trust standard for us than for a project known to
>> be at odds with our shared goals, such as Linux?

> I don't understand how you reached the conclusion that I'm setting a
> higher trust standard for Linux-libre than for Linux.

You blindly trust Linux release tags, but not ours.

OTOH, you're right that it's not a strictly higher standard. You also
trust our cleanup scripts, even when we tell you they're not fit for the
use cases you put them through.

> The principle I'm following here is simply to avoid relying on the
> integrity of any system if I can easily avoid it.

You could avoid relying on the integrity of Linux release tags, and
trust ours instead. That's what tells me you don't trust Linux-libre as
much as you do Linux.

You could use our tags, at the very least to check that you got
something sensible out of your own deblobbing run, but you don't even
look at them. You're not checking anything, so what you put builders
through is at best busy, redundant work, and at worst, a waste of CPU
cycles that doesn't even get them what they hope for.

> However, I reject the argument that because we must
> trust X and Y, we might as well trust Z as well.
That doesn't follow indeed. What I was saying was that, instead of
trusting both X and Y, you might trust just X instead, while you
insisted on trusting mostly just Y instead (but also X and a bunch of
other tools used underneath).

>> But the point stands that, for someone who'd rather trust no one, you're
>> blindly trusting both Linux and Linux-libre. The former when it comes
>> to base releases you don't check; the latter when it comes to scripts
>> whose results you hardly even look at. Why not reduce your trust base
>> to just Linux-libre,

> That's not possible. Clearly, you do not have the capacity to audit all
> of the code that Linux produces. Therefore, by trusting Linux-libre, we
> must implicitly also trust the Linux project. That much we cannot
> avoid. We also cannot avoid trusting your deblob scripts.

True, we don't even attempt to audit Linux sources in this sense.

This seems to imply that taking our cleaned-up sources, and taking
Linux' sources and cleaning them up, carries exactly the same amount of
trust on each project involved.

And yet you prefer to trust the one that sneaks non-FSDG bits in every
now and again, instead of the one that hunts them down and removes them.

> However, we *can* easily avoid trusting the integrity of the systems
> that you use to run the deblob scripts.

You *could* avoid that, and also some blind trust on the underlying
tools and systems used for cleaning up by us and by you, by at least
*comparing* the cleaned-up tree you get with the one we provide.

But that's not what you do. You distrust us enough to shed doubts on
our processes, but you (and guix builders, trusting you) trust us enough
to run our scripts for purposes they aren't fit for, and trust a very
complex and fragile combination of tools and systems to carry out its
difficult job without giving their output a second look.

> In fact, I strongly support reducing Guix's reliance on pre-generated
> outputs produced by *any* project. I'm not singling out the Linux-libre
> project here.

You really are. You take most other projects' releases without anything
even close to the amount of scrutiny and disregard that you place on the
results of our release engineering processes and resulting release
tarballs and tags.

You might not think so if you consider the deblobbing scripts we publish
for transparency and verification as our releases, but since they (very)
occasionally remain unchanged even when new changes need to be made (*),
say because they by chance already contain the code that makes the
newly-needed changes, that supposed equivalence is a mistake.

(*) just as I write this, I manually check 5.8.3-gnu and find a fresh
example of this, also applicable to 5.7.17-gnu and 5.4.60-gnu.

-- 
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer
* Re: Linux-libre 5.8 and beyond
From: Alexandre Oliva @ 2020-08-24 4:42 UTC
To: Mark H Weaver; +Cc: guix-devel, Jason Self

Hello, Mark,

On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:

> I was talking about my hope to enable users, *on their own
> machines* and using *their own private build recipes*, to make a
> best-effort deblobbing of a non-standard kernel variant that they need
> to use for whatever reason.

A non-free kernel, standard or not, shouldn't really be in scope for a
FSDG distro, IMHO. Even the pointer to the non-Free releases used as a
starting point for build recipes comes across as undesirable to me, more
so when there's an expectation (and such a high concern) for enabling
users to use them, with a near-certainty that this will go silently
wrong freedom-wise.

> If they aren't provided with that option,
> the obvious alternative (which I expect 99% of such users would do
> anyway) is to simply run a fully-blobbed kernel instead.

I'm surprised that they'd prefer to run deblobbing and checking at each
point of a bisection, over applying the deblobbing changes as a patch,
or even starting from a Free release, rebasing a set of changes to test
onto it, and quickly building and bisecting that. That's what I would
rather do. But then, I probably wouldn't be using the guix build recipe
and default kernel config for the bisection, but rather a smaller config
built within the bisect tree.

> Alexandre Oliva <lxoliva@fsfla.org> wrote:
>> I'm sure that's not what you intend, but this arrangement, plus your
>> mention of hurriedly getting releases out, adds up to an incentive to
>> disable the deblobbing so as to get a faster build.

> I don't understand how you reached this conclusion. As far as I can
> tell, changing Guix to run the deblob scripts made *no* difference to
> what someone would have to do to ask Guix to build fully-blobbed
> kernel.

One of the issues, as you'd pointed out, was time pressure to get a
build completed. If someone is under such pressure, and knows that
deblobbing will take 30 minutes, and that verifying the deblobbed tree
will take another 30 minutes (or 24 hours, if using the wrong tool for
the job), one might disable the cleaning up rather than figuring out how
to get the recipe to use an already cleaned and verified release.

> In particular, if I can easily run an
> automated process on my own machine instead of relying on some other
> system to provide pre-generated outputs for me, then I prefer to do it
> myself.

That's at odds with the time pressure you mentioned before.

Now, let me get something straight. You seem to have got the idea that
I oppose verification of our releases. That's very very wrong. I
welcome verification. I just don't see that this belongs in the guix
build process.

I get it that guix packages several projects that need cleaning up.
IMHO, guix build recipes should NOT point at such upstream projects
along with the cleaning up recipes. This should be part of a separate
recipe, namely, that of packaging/verifying/blessing *sources* for use
in guix.

Once the sources are packaged (in a verifiable/reproducible way), they
should be made available by the distro to users. These are the
corresponding sources that we expect every distro to offer. It's not
just about builders getting those sources, verifying them (or not) and
making binaries out of them. Any user ought to be entitled to request
corresponding sources to binaries provided by guix, and guix should be
able to provide them without requiring users to run potentially complex
procedures that might even end up producing different results, depending
on platform-specific bugs, versions of tools, not to mention the various
other potential sources of non-reproducible sources and binaries.

Even if the procedures are meant to be reproducible, you'll only know
they aren't when you manage to trace a difference in a packaged binary
back to a difference in sources, when you can no longer reproduce the
sources used before. Archiving the sources proper, for verification and
for distribution to users as corresponding sources, would avoid
surprises of non-reproducible procedures being found out long after the
fact, just when corresponding sources are requested and can't be
provided any more.

I'm ambivalent as to whether patches that guix wishes to apply should be
applied as part of source packaging, or have the patches made available
separately. I can see arguments both ways. On the one hand, applying
patches, as reproducible as it normally is, might be subject to
occasional variations, especially when the line numbers or the contexts
in the patch are inexact. On the other, these cases are extremely rare,
and being able to reuse a base tarball while trying out some patches,
without having to repackage a base tarball, and having patches
conspicuously presented to builders and users, separately from an
upstream base release, is desirable when the patches are not meant to
address freedom issues (those that do address freedom issues had better
be applied by other means during source packaging, to avoid publishing
reversible patches that could be used to reintroduce the freedom
issues). This suggests there could be support for patches in both
source preparation recipes, and in build recipes.
For projects that need cleaning up, the source packaging recipe could
apply any needed cleaning up. For projects like GNU Linux-libre, that
are already cleaned up, or most other packages that don't need any
cleaning up whatsoever, the source preparation recipe could be as simple
as downloading the sources, as well as any signatures thereof, checking
that they match, and recording the checksums of the sources to be used
for binary building.

Source preparation might also offer a verify mode, that would *also*
fetch the sources from a corresponding release of the project that needs
cleaning up, perform the cleaning up and compare the results, but I'd
much rather links to the corresponding projects that need cleaning up be
pushed out of FSDG-compliant distros. Maintainers of such packages
could and probably should run such verification themselves, without
exposing every builder to the non-Free pointers and code.

I urge guix to address the problem of build recipes pointing to non-Free
packages and getting builders to download non-Free Software onto their
machines.

It would probably be wise to discuss more broadly how FSDG distros can
document and share their cleaning up procedures, so that builders and
users can double-check them if they wish to, and so that other FSDG
distros can cooperate and reuse. Clearly we don't wish distro
maintainers to keep these private to themselves, but we surely don't
want links to sources containing unacceptable sources to be conspicuous
in the distro either, let alone being used when Free sources are or
could be readily available.

Now, I wonder... If sources for projects other than Linux-libre and
GNUzilla need cleaning up, perhaps it would make sense for our
community, e.g. GNU, to undertake the source cleaning up, releasing
clean sources for all interested distros and users to get to.
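The simple source preparation recipe described above (download the sources, check them, record their checksums for binary building) might look roughly like the sketch below. It covers only the checksum-recording and re-checking part: signature verification is left out, because it needs tooling outside the Python standard library. The function names, the JSON manifest format and the file names are all hypothetical; this is not how guix records source hashes.

```python
import hashlib
import json
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_sources(paths, manifest_path):
    """Write a JSON manifest mapping each source file name to its digest."""
    manifest = {Path(p).name: sha256_file(p) for p in paths}
    text = json.dumps(manifest, indent=2, sort_keys=True) + "\n"
    Path(manifest_path).write_text(text)
    return manifest


def verify_sources(paths, manifest_path):
    """Return names of files that are missing from, or differ from, the manifest."""
    manifest = json.loads(Path(manifest_path).read_text())
    return sorted(
        Path(p).name
        for p in paths
        if manifest.get(Path(p).name) != sha256_file(p)
    )
```

A manifest recorded when the sources are archived gives a later build, or a user requesting corresponding sources, something fixed to check against. `verify_sources` returns an empty list when every file still matches.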
Perhaps we could encourage maintainers of such packages in the various
Free distros to share and divide the workload of maintaining them and
the cleaning up recipes, for everyone's benefit. Then guix could just
point at the clean sources released by this project, instead of going
through the significant change of introducing separate 'prepare sources'
recipes to avoid pointing users at non-Free sources through build
recipes.

-- 
Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/
Free Software Activist GNU Toolchain Engineer