Re: Linux-libre 5.8 and beyond

* Re: Linux-libre 5.8 and beyond
@ 2020-08-09 20:15 Jason Self
  2020-08-13  0:39 ` Mark H Weaver
  0 siblings, 1 reply; 30+ messages in thread
From: Jason Self @ 2020-08-09 20:15 UTC (permalink / raw)
  To: guix-devel

> the linux-libre project periodically deletes most of its older
> tarballs, even if there are no accidents.

Just FYI that git://linux-libre.fsfla.org/releases.git was created
mainly to solve that problem. Versions are now pretty much permanent.

> It may be useful for users with newer hardware devices, which are
> not yet well supported by the latest stable release, to use an
> arbitrary commit from either Linus' mainline git repository or some
> other subsystem tree.

The cleaning up scripts are version-specific and won't work on an
"arbitrary commit from Linus's mainline git repository" (i.e., someone
wanting to get today's most recent commit going into 5.9.) The scripts
would fall over and die in such a scenario, or if forced to continue by
using --force the result would be incomplete cleaning. Using the
scripts on any version other than the precise version that they were
intended for can also cause them to fail in obscure ways, as Vagrant
Cascadian has found out firsthand by running the 5.7 cleaning scripts
on 5.8 (that was determined to be the source of the problems they were
having.) If you look closely at the results of Vagrant Cascadian's
attempt, you'll see there were more than syntax errors: plenty of blobs
were certainly left in. Thus: As said, the clean up scripts can only be
used for the version that they were intended for. Use with any other
version invites problems.

> It allows us to update to a new point version (which usually
> includes security fixes) more quickly, before the linux-libre
> project reacts.

Any attempt to outrun the Linux-libre project and get updates out sooner
is unwise. While major new kernel releases will definitely require 
updates to the cleanup scripts, even minor patched versions 
occasionally require changes too. Updating to a new version prior to 
the Linux-libre project having had time to review that new version and 
determine if any updates are needed to the scripts risks introducing
freedom problems in the corresponding Guix version.

The moment that the Linux-libre project determines that scripts are
suitable is the moment that the new cleaned-up release is ready to
publish in git and the appropriate tags will then appear in git. The
compressed tarballs come some time later.

* Re: Linux-libre 5.8 and beyond
  2020-08-09 20:15 Linux-libre 5.8 and beyond Jason Self
@ 2020-08-13  0:39 ` Mark H Weaver
  2020-08-13 16:47   ` Linux-libre git repository Vagrant Cascadian
  2020-08-14 13:47   ` Linux-libre 5.8 and beyond Alexandre Oliva
  0 siblings, 2 replies; 30+ messages in thread
From: Mark H Weaver @ 2020-08-13  0:39 UTC (permalink / raw)
  To: Jason Self; +Cc: guix-devel

Hi Jason,

I didn't see your email until just now.  I read this list only
sporadically, so it's best to keep me in the CC list for messages that
you'd like me to see, or that are responses to me.

Mark H Weaver <mhw@netris.org> wrote:
>> the linux-libre project periodically deletes most of its older
>> tarballs, even if there are no accidents.

Jason Self <jason@bluehome.net> responded:
> Just FYI that git://linux-libre.fsfla.org/releases.git was created
> mainly to solve that problem. Versions are now pretty much permanent.

That's helpful, thanks.  I didn't know about this.  Out of curiosity, is
this git repository advertised anywhere?  I wasn't able to easily find
it on <https://www.fsfla.org/ikiwiki/selibre/linux-libre/>, but I didn't
look carefully, perhaps I missed it.

One question: Would it solve the problem that I mentioned in my earlier
email, namely the problem of how to determine which precise commit
introduced a regression between two stable kernel releases?  If not, I
think that justifies the machinery that Guix includes to do the
deblobbing itself.

>> It may be useful for users with newer hardware devices, which are
>> not yet well supported by the latest stable release, to use an
>> arbitrary commit from either Linus' mainline git repository or some
>> other subsystem tree.
>
> The cleaning up scripts are version-specific and won't work on an
> "arbitrary commit from Linus's mainline git repository" (i.e., someone
> wanting to get today's most recent commit going into 5.9.) The scripts
> would fall over and die in such a scenario,

Okay, perhaps this was wishful thinking on my part.  I had hoped that
the deblob scripts would typically mostly work, even if they weren't
able to do a comprehensive cleaning.  I would oppose adding such a
partly-cleaned kernel to Guix itself, but I wanted to enable users who
need to use some other branch of Linux on their own systems to make a
best-effort cleaning.

>> It allows us to update to a new point version (which usually
>> includes security fixes) more quickly, before the linux-libre
>> project reacts.
>
> Any attempt to outrun the Linux-libre project and get updates out sooner
> is unwise. While major new kernel releases will definitely require
> updates to the cleanup scripts, even minor patched versions
> occasionally require changes too. Updating to a new version prior to
> the Linux-libre project having had time to review that new version and
> determine if any updates are needed to the scripts risks introducing
> freedom problems in the corresponding Guix version.

In my experience, the deblob scripts are very rarely changed after the
first few point releases of a stable release series.  I know this
because I always check for updates to the deblob scripts whenever I
update linux-libre in Guix.  In practice, the deblob scripts used by
Guix are never more than 1 or 2 micro versions behind the version of
Linux they are applied to.

> The moment that the Linux-libre project determines that scripts are
> suitable is the moment that the new cleaned-up release is ready to
> publish in git and the appropriate tags will then appear in git. The
> compressed tarballs come some time later.

I prefer to avoid unnecessary delays when applying micro kernel updates,
because I assume that many of the fixes are potentially security fixes
(although they are rarely marked as such because upstream does not
attempt to determine the security relevance of most fixes, which is
reasonable).

I also consider it unwise for all of us, as a matter of habit or policy,
to trust the integrity of the computer systems used by the Linux-libre
project to perform the deblobbing.  It's not that I doubt the competence
of those people who maintain or administer those systems; it's that I
think it's unwise to trust *any* computer system that we can easily
avoid trusting.  Personally, I don't consider any modern civilian
computer system to be trustworthy, and especially not one that paints a
target on its back by being a potential vector for compromising the
machines of large numbers of users.

Enabling users to run the Linux-libre deblob scripts on their own
computers (as I do; I *never* use substitutes) enables them to remove
one computer system from the set of systems that they must trust.  I
think that's a good thing.

     Regards,
       Mark

* Re: Linux-libre git repository
  2020-08-13  0:39 ` Mark H Weaver
@ 2020-08-13 16:47   ` Vagrant Cascadian
  2020-08-14  0:03     ` Jason Self
  2020-08-14 14:03     ` Danny Milosavljevic
  2020-08-14 13:47   ` Linux-libre 5.8 and beyond Alexandre Oliva
  1 sibling, 2 replies; 30+ messages in thread
From: Vagrant Cascadian @ 2020-08-13 16:47 UTC (permalink / raw)
  To: Mark H Weaver, Jason Self; +Cc: guix-devel

On 2020-08-12, Mark H Weaver wrote:
> Mark H Weaver <mhw@netris.org> wrote:
>>> the linux-libre project periodically deletes most of its older
>>> tarballs, even if there are no accidents.
>
> Jason Self <jason@bluehome.net> responded:
>> Just FYI that git://linux-libre.fsfla.org/releases.git was created
>> mainly to solve that problem. Versions are now pretty much permanent.
>
> That's helpful, thanks.  I didn't know about this.  Out of curiosity, is
> this git repository advertised anywhere?  I wasn't able to easily find
> it on <https://www.fsfla.org/ikiwiki/selibre/linux-libre/>, but I didn't
> look carefully, perhaps I missed it.

News item for 2020-05-31 mentions it, but clearly it should be more
prominently displayed or documented.

> One question: Would it solve the problem that I mentioned in my earlier
> email, namely the problem of how to determine which precise commit
> introduced a regression between two stable kernel releases?  If not, I
> think that justifies the machinery that Guix includes to do the
> deblobbing itself.

The granularity appears to be at the level of released tags. I see tags
with one commit per release, with an independent history from previous
versions; I *think* git bisect wouldn't work without some manual
fiddling and you'd have to manually bisect based on version.
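
A manual bisection by version could look roughly like the sketch
below. It is only an illustration: first_bad and the test command it
calls are hypothetical names, and it narrows a regression down to a
release, not to a commit.

```shell
# Sketch only: binary search over an ordered list of release versions.
# first_bad TEST_CMD VERSION...
#   TEST_CMD  hypothetical command or function; exits 0 if the kernel
#             built from the given version works, non-zero otherwise
#   VERSION   versions from oldest to newest; the first is assumed
#             good and the last is assumed bad
first_bad() {
    test_cmd=$1
    shift
    lo=1      # index of the newest version known to be good
    hi=$#     # index of the oldest version known to be bad
    while [ $((hi - lo)) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        eval "candidate=\${$mid}"
        if "$test_cmd" "$candidate"; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    # Print the oldest version that failed the test.
    eval "printf '%s\n' \"\${$hi}\""
}
```

Each call to the test command would stand for building and booting one
tagged release, so the number of builds grows with the logarithm of
the number of releases between the good and the bad one.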

I tried a quick experiment using the linux-libre git repository to build
a package for arm64 in gnu/packages/linux.scm:

(define-public linux-libre-fsfla-git-arm64-generic
  (let* ((version "5.8.1-gnu")
         (source
          (origin
            (method git-fetch)
            (uri (git-reference
                  (url "git://linux-libre.fsfla.org/releases.git")
                  (commit (string-append "sources/v" version))))
            (file-name (git-file-name "linux-libre-fsfla-git" version))
            (sha256
             (base32
              "05v2l4r34nbkv6wpgrzydlb0fkpswpvzdya9vx30wap3n9a9wp6n"))
            (patches
             (list %boot-logo-patch
                   %linux-libre-arm-export-__sync_icache_dcache-patch)))))
    (make-linux-libre*
     version
     source
     '("aarch64-linux")
     #:defconfig "defconfig"
     #:extra-version "fsfla-git-arm64-generic")))

The source checkout was quite slow to download, and took up ~1GB in the
store once completed. I'm not sure how guix's git origin works exactly;
if it downloads the entire git history even to perform a shallow
checkout of a single commit, and then throws out the git history? It did
appear to be calling git with flags to perform a shallow checkout.

It certainly was slower than downloading a compressed tarball. The
de-duplication of /gnu/store might still be beneficial if you have
significantly more than ~10 versions in /gnu/store, as not every file
changes with every release, but overall using compressed tarballs seems
to be faster to download and extract even on a slow machine.

This partly points to challenges with guix's handling of git
repositories, exacerbated by larger git repositories. It would be more
viable if there were some way to cache git results, such as running
"git clone --bare ~/.cache/guix/..." if the cache is not present and
"git fetch origin" if it is, and then populating the store from the
cached git repository, much like what is done with "guix pull" ...
Surely this has been brought up before? Maybe this breaks the purity of
guix's functional paradigm, but arguably no more than a caching http
proxy really.
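
As a rough sketch of that caching idea (an illustration only, not
something Guix does): the function names cached_fetch and export_tag
and the cache location passed to them are assumptions. The sketch uses
"git clone --mirror", which is a bare clone that also configures a
refspec so that a later "git fetch origin" updates every ref.

```shell
# Hypothetical helper: keep a bare mirror of a repository in a cache
# directory, creating it on first use and refreshing it afterwards.
# cached_fetch URL CACHE_DIR
cached_fetch() {
    url=$1
    cache=$2
    if [ -d "$cache" ]; then
        # Cache already present: only download what is new.
        git -C "$cache" fetch --quiet origin
    else
        # First use: create the bare mirror.
        mkdir -p "$(dirname "$cache")"
        git clone --quiet --mirror "$url" "$cache"
    fi
}

# Export one tag from the cache into a plain directory, without .git.
# export_tag CACHE_DIR TAG DEST_DIR
export_tag() {
    mkdir -p "$3"
    git -C "$1" archive --format=tar "$2" | tar -xf - -C "$3"
}
```

Only the first call pays for the full download; later calls transfer
just the objects added since the previous fetch.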

It is also possible to retrieve tarballs directly from linux-libre git
tags, though I know that, at least for projects hosted on github, this
does occasionally result in non-identical tarballs. Not sure what factors
might trigger this, other than changing tags, but possibly different git
versions, tar versions and flags, and compression tool versions and
optimizations could be a factor. Reproducible builds has documented some
potential causes:

  https://reproducible-builds.org/docs/archives/

There are also the released linux-libre tarballs, though those may have
the persistence issue previously mentioned. The code to use them is
still present in guix; I made a package using:

(define-public linux-libre-fsfla-arm64-generic
  (make-linux-libre "5.8.1"
                    "1v7glmvz3laj1awh5zrqclp2pzfs0cjf6y3n6v97j7z901s1vlxd"
                    '("aarch64-linux")
                    #:defconfig "defconfig"
                    #:extra-version "fsfla-arm64-generic"))

After patching the make-linux-libre call to also include a patch needed
for newer versions:

-                           (patches (list %boot-logo-patch)))
+                           (patches (list %boot-logo-patch
+                                         %linux-libre-arm-export-__sync_icache_dcache-patch)))

Not sure why that patch isn't upstream; Debian has been carrying it for
some years now... and my guix build failed to build without it on
aarch64/arm64.

live well,
  vagrant

* Re: Linux-libre git repository
  2020-08-13 16:47   ` Linux-libre git repository Vagrant Cascadian
@ 2020-08-14  0:03     ` Jason Self
  2020-08-14 14:03     ` Danny Milosavljevic
  1 sibling, 0 replies; 30+ messages in thread
From: Jason Self @ 2020-08-14  0:03 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: guix-devel

On Thu, 13 Aug 2020 09:47:21 -0700
Vagrant Cascadian <vagrant@reproducible-builds.org> wrote:

> It is also possible to retrieve tarballs directly from linux-libre git
> tags, though I know that, at least for projects hosted on github, this
> does occasionally result in non-identical tarballs. Not sure what factors
> might trigger this, other than changing tags, but possibly different
> git versions, tar versions and flags, and compression tool versions
> and optimizations could be a factor. Reproducible builds has
> documented some potential causes:

Adding in compression changes this because, for just one example,
compression details can change between versions of compressors.

Assuming that there is no compression and there aren't changes in the
underlying git repository and assuming that git archive is invoked with
precisely the same parameters each time, git archive is supposed to
generate bit-identical tarballs between different platforms/versions of
git (it's considered a bug if it doesn't.)

Indeed, the Linux stable tree takes advantage of this reproducibility by
adding a GPG signature for the uncompressed tarballs as a git note under
refs/notes/signatures/tar. The signature also includes a comment
with the precise command to regenerate the uncompressed tarball with
git archive. This then makes it possible to verify a GPG signature of an
uncompressed tarball that way. An example is [0]. cgit automatically
adds the (sig) link when the corresponding git note is added in
refs/notes/signatures/tar but they can also be accessed directly from
within git.
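
The reproducibility property can be checked locally with a sketch like
the following. The function name archive_sha256 and the choice of
prefix are assumptions for illustration; it does not verify any
signature, it only hashes the uncompressed tar stream that git archive
produces for a given ref.

```shell
# Sketch only: hash the uncompressed tarball generated for a ref.
# archive_sha256 REPO REF PREFIX
#   REPO    path to a local clone (bare or not)
#   REF     tag or commit to export
#   PREFIX  top-level directory name to use inside the tarball
archive_sha256() {
    git -C "$1" archive --format=tar --prefix="$3/" "$2" |
        sha256sum | cut -d ' ' -f 1
}
```

Running it twice on the same tag, for example one of the
"sources/v...-gnu" tags in a clone of the releases repository, should
print the same hash both times as long as the parameters are
identical.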

I found that useful after learning that GPG signatures within git itself
"only validate the commit file contents up to the SHA-1 of the top level
tree, it's not a GPG signature of the entire tree state. This means that a
SHA-1 collision on the tree object, or any blob object, still results
in a valid GPG signature."

It seemed to be a neat way to sidestep the whole matter of SHA-1 falling
apart, at least until git moves on to SHA-2 at some as-yet-unknown
future point.

Anyway, the Linux-libre git repository similarly contains GPG
signatures for the uncompressed tarballs, but as tags rather than as
git notes; either way, the outcome is the same.

[0] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/

refs/notes/signatures/tar

* Re: Linux-libre git repository
  2020-08-13 16:47   ` Linux-libre git repository Vagrant Cascadian
  2020-08-14  0:03     ` Jason Self
@ 2020-08-14 14:03     ` Danny Milosavljevic
  1 sibling, 0 replies; 30+ messages in thread
From: Danny Milosavljevic @ 2020-08-14 14:03 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: guix-devel, Jason Self

Hi Vagrant,

On Thu, 13 Aug 2020 09:47:21 -0700
Vagrant Cascadian <vagrant@reproducible-builds.org> wrote:

> The source checkout was quite slow to download, and took up ~1GB in the
> store once completed. I'm not sure how guix's git origin works exactly;

# url, commit and recursive stand for the fields of the git origin
git init
git remote add origin "$url"
if git fetch --depth 1 origin "$commit"
then
  git checkout FETCH_HEAD
else
  echo "Failed to do a shallow fetch; retrying a full fetch..."
  git fetch origin
  git checkout "$commit"
fi
if [ "$recursive" = true ]
then
  git submodule update --init --recursive
  # remove the .git entry of each submodule
  find . -mindepth 2 -name .git -prune -exec rm -rf {} +
fi
rm -rf .git

See guix/build/git.scm .

There exist git servers that have disabled fetching by commit hash for
"security" reasons (if you checked in a file containing a password and
then removed it again, and no branch or tag pointing to it exists,
nobody can get to it even if they knew the commit hash).  We would
always use the fallback for those servers.

> if it downloads the entire git history even to perform a shallow
> checkout of a single commit, and then throws out the git history?

As a fallback if the above doesn't work.

> appear to be calling git with flags to perform a shallow checkout.

Yes.

* Re: Linux-libre 5.8 and beyond
  2020-08-13  0:39 ` Mark H Weaver
  2020-08-13 16:47   ` Linux-libre git repository Vagrant Cascadian
@ 2020-08-14 13:47   ` Alexandre Oliva
  2020-08-15  6:03     ` Mark H Weaver
  1 sibling, 1 reply; 30+ messages in thread
From: Alexandre Oliva @ 2020-08-14 13:47 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Jason Self

Hello, Mark,

On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote:

> Mark H Weaver <mhw@netris.org> wrote:
>>> the linux-libre project periodically deletes most of its older
>>> tarballs, even if there are no accidents.

> Jason Self <jason@bluehome.net> responded:
>> Just FYI that git://linux-libre.fsfla.org/releases.git was created
>> mainly to solve that problem. Versions are now pretty much permanent.

> That's helpful, thanks.  I didn't know about this.  Out of curiosity, is
> this git repository advertised anywhere?

Not much.  It was mentioned back in the announcements of 5.7-gnu and a
few subsequent ones on social media; in the 5.7-gnu news entry in the
Linux-libre web site, and in the documentation we wrote for Guix
developers, that was sent to some of you not long ago.

Though it was announced sort of widely, since this move was directed
primarily at satisfying a Guix pain point, I figured I'd add it to
downloads only after making sure it did address Guix's needs, so that,
should it require significant changes, there wouldn't have to be much
concern about backward compatibility with the current status quo.

> One question: Would it solve the problem that I mentioned in my earlier
> email, namely the problem of how to determine which precise commit
> introduced a regression between two stable kernel releases?

No.  There are much better (faster and less risky) ways to tend to that
requirement, see #bisecting below.

>>> It may be useful for users with newer hardware devices, which are
>>> not yet well supported by the latest stable release, to use an
>>> arbitrary commit from either Linus' mainline git repository or some
>>> other subsystem tree.
>> 
>> The cleaning up scripts are version-specific and won't work on an
>> "arbitrary commit from Linus's mainline git repository" (i.e., someone
>> wanting to get today's most recent commit going into 5.9.) The scripts
>> would fall over and die in such a scenario,

> Okay, perhaps this was wishful thinking on my part.

Yup.  If you ran a deblob-check in verify mode on the resulting
tarballs, you'd see how error-prone this is.  You'd at least stop
non-Free code from silently sneaking in and finding its way into running
on users' machines.  That's the *least* someone who runs the
deblob-scripts on their own should do to smoke-test the result WRT
*known* freedom issues.

> I had hoped that the deblob scripts would typically mostly work, even
> if they weren't able to do a comprehensive cleaning.

I'd honestly hope for a much higher standard than that for an
FSDG-compliant distro, especially one that carries the GNU mark.

> I would oppose adding such a partly-cleaned kernel to Guix itself,

But you don't!  That's what you get when you jump the gun and use
outdated cleaning up scripts, without waiting for us to verify,
update and release them for a newer version.

> but I wanted to enable users who need to use some other branch of
> Linux on their own systems to make a best-effort cleaning.

Besides the likelihood of something going wrong, that seems like a
backwards goal for a distro that is not expected to as much as point
users at a non-Free package.

I'm sure that's not what you intend, but this arrangement, plus your
mention of hurriedly getting releases out, adds up to an incentive to
disable the deblobbing so as to get a faster build.  I hope you'll agree
that this is undesirable.  As for how to speed up builds without
sacrificing freedom, see below.

>>> It allows us to update to a new point version (which usually
>>> includes security fixes) more quickly, before the linux-libre
>>> project reacts.
>> 
>> Any attempt to outrun the Linux-libre project and get updates out sooner
>> is unwise. While major new kernel releases will definitely require
>> updates to the cleanup scripts, even minor patched versions
>> occasionally require changes too. Updating to a new version prior to
>> the Linux-libre project having had time to review that new version and
>> determine if any updates are needed to the scripts risks introducing
>> freedom problems in the corresponding Guix version.

> In my experience, the deblob scripts are very rarely changed after the
> first few point releases of a stable release series.

My personal experience tells me otherwise.  5.7 had only one update at
.8; 5.6, at .6 and .16; 5.5, at .3, .11 and .19; 5.4, at .14, .18, .27,
.34 and .44; 5.3, at .4 and .11; 5.2 at .1, .3 and .11; 5.1 at .2, .18
and .20; 5.0 at .7 and .16.  What you describe was true only of 4.17,
4.10, 4.3, 3.13, 3.5, and 3.2, i.e. 6 out of the 50 major releases
starting at 3.0.

> I know this because I always check for updates to the deblob scripts
> whenever I update linux-libre in Guix.  In practice, the deblob
> scripts used by Guix are never more than 1 or 2 micro versions behind
> the version of Linux they are applied to.

There have been 61 script updates for the 1274 4.*.*-gnu* and 5.*.*-gnu*
stable releases, so Guix has shipped potentially non-FSDG code, which
*would* have been flagged by deblob-check on the tarballs, in between 5%
and 10% of these releases.  Does that sound like a good standard for a
freedom-first distro to aim for?
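
As a back-of-the-envelope check of that range: if each of the 61
script updates left one release cleaned with outdated scripts, that is
61 of 1274; if each left two, it is 122 of 1274. The one-or-two figure
is an assumption borrowed from the "1 or 2 micro versions behind"
remark quoted above, not something counted from the release history.

```shell
# percent PART TOTAL -> PART as a percentage of TOTAL, one decimal
percent() {
    LC_ALL=C awk -v part="$1" -v total="$2" \
        'BEGIN { printf "%.1f\n", 100 * part / total }'
}

percent 61 1274            # one affected release per script update
percent $((2 * 61)) 1274   # two affected releases per script update
```

This prints 4.8 and 9.6, which rounds to the 5% to 10% range.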

>> The moment that the Linux-libre project determines that scripts are
>> suitable is the moment that the new cleaned-up release is ready to
>> publish in git and the appropriate tags will then appear in git. The
>> compressed tarballs come some time later.

> I prefer to avoid unnecessary delays when applying micro kernel updates,

Sorry, but it doesn't look like you do.  If you did, you would be taking
a cleaned up tree instead of re-deblobbing it.  You skip even the
automated verification we do, which saves you some time, but at what
price?

If you waited another 30 minutes for our cleaned-up and verified tree to
be available from git, you'd save yourself the 20 minutes of cleaning-up
and another 20 minutes of deblob-checking the tarball for known or
likely freedom issues.  That sounds like a net win to me.

Now, if your build machines clean up and verify much faster than ours,
I'd be pretty glad to use them to get the verified commits in place so
that you could use them faster.

> I also consider it unwise for all of us, as a matter of habit or policy,
> to trust the integrity of the computer systems used by the Linux-libre
> project to perform the deblobbing.

I welcome double-checking of our cleaning up at all levels, but why are
you setting a higher trust standard for us than for a project known to
be at odds with our shared goals, such as Linux?  You don't apply the
patches that went into it since the last known good release to
double-check their releases, do you?  For most projects, you just take
their tarballs or tags and build it.  For Linux-libre, you start from
(untrustworthy) Linux, run the (presumed untrustworthy) cleaning up
scripts, and blindly trust the result.  There's no self-verification run
with deblob-check, no compare with our release, nothing.

If you meant to test the integrity of our releases, one would think
you'd at least look at them.  Starting from a known-good Linux release and
applying patches to double-check the results is expensive, so it makes
sense to do that only occasionally, rather than as part of every build.
Deblobbing and checking the result is also expensive, so it also makes
sense for you to do so only occasionally, rather than as part of every
build.

But the point stands that, for someone who'd rather trust no one, you're
blindly trusting both Linux and Linux-libre.  The former when it comes
to base releases you don't check; the latter when it comes to scripts
whose results you hardly even look at.  Why not reduce your trust base
to just Linux-libre, treat it as a citizen of the same class as nearly
every other project you build, and satisfy your trust-but-verify needs
by looking into what changes between one of our releases and another?

#bisecting

You can even take one of our releases and apply the patches that went
into the next upstream stable release, and check that what you get
matches our own corresponding release.  Some 98% of the time, they will
be exact matches.  Occasionally, there will be a difference, and then
you'll likely find a corresponding change in the deblobbing scripts, or
a preexisting pattern that caused the change.  We do this for every
release, as part of our pre-release checks, and you're welcome to do so
as well, and to use the resulting tree to bisect problems.

You'll see the builds are much faster if you don't have to deblob every
build.

Now, we've long had plans to publish a cleaned-up repo with Linux git
history.  It would take a massive amount of work to get it started, but
after that, getting new releases out might be about as fast as running a
git merge.  Ok, not that fast because there'd be some checking, but you
get the idea.

Other ideas to speed up our release process are:

- enabling cleaning up in multiple concurrent processes, like 'make -j'

- ditto for the deblob-check tarball verification

- use faster machines for the above

- monitor upstream git and fire up cleaning up automatically, and move
the various manual signatures of commits, tarballs, and logs to the very
end, after manual checking

These would enable the git commits to be out sooner.  Currently, our
best case is to push a release to the git release archive about one hour
after the upstream release, but we can realistically achieve that only
when there's a single release to do.  Most often, there are four to
seven releases at once.  Since we don't always react immediately (one
needs to sleep occasionally ;-), we don't get early warnings,
especially about major security issues, and our main workhorse has
limited capacity, we end up at about one hour per release anyway, with
all of them ready at about the same time.

(Compression of tarballs then takes another half hour or more per
release, but the releases are pushed to the git release archive long
before compression is completed)

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer

* Re: Linux-libre 5.8 and beyond
  2020-08-14 13:47   ` Linux-libre 5.8 and beyond Alexandre Oliva
@ 2020-08-15  6:03     ` Mark H Weaver
  2020-08-16  1:24       ` Mark H Weaver
                         ` (6 more replies)
  0 siblings, 7 replies; 30+ messages in thread
From: Mark H Weaver @ 2020-08-15  6:03 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: guix-devel, Jason Self

Hi Alexandre,

Alexandre Oliva <lxoliva@fsfla.org> wrote:
> On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote:
>
>>>> It may be useful for users with newer hardware devices, which are
>>>> not yet well supported by the latest stable release, to use an
>>>> arbitrary commit from either Linus' mainline git repository or some
>>>> other subsystem tree.
>>> 
>>> The cleaning up scripts are version-specific and won't work on an
>>> "arbitrary commit from Linus's mainline git repository" (i.e., someone
>>> wanting to get today's most recent commit going into 5.9.) The scripts
>>> would fall over and die in such a scenario,
>
>> Okay, perhaps this was wishful thinking on my part.
>
> Yup.  If you ran a deblob-check in verify mode on the resulting
> tarballs, you'd see how error-prone this is.  You'd at least stop
> non-Free code from silently sneaking in and finding its way into running
> on users' machines.  That's the *least* someone who runs the
> deblob-scripts on their own should do to smoke-test the result WRT
> *known* freedom issues.

What is this "verify mode" that you're referring to, and where is it
documented?  The word "verify" does not occur in either of the deblob
scripts that I know about, namely "deblob-<VERSION>" and "deblob-check".
The string "verif" occurs a few times, but nothing related to the script
functionality.  I don't see anything like a verification mode mentioned
in the options documented at the top of those two scripts.

For the record, it was not my intent to skip any automated checking
provided by these scripts.  If we're running the scripts in a suboptimal
way, please tell me a better way.

FYI, right now we're simply running the main 'deblob-<VERSION>' script
with no arguments in the unpacked Linux source directory, with the
corresponding 'deblob-check' script in $PATH and $PYTHON pointing to
python 2.x.  If 'deblob-<VERSION>' exits abnormally or with a non-zero
result, the Guix build process fails.
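
In shell terms, the invocation described above amounts to roughly the
following. The function name run_deblob and its argument order are
made up for illustration; this is not the Guix build code itself.

```shell
# Sketch only: run the main deblob script inside an unpacked source
# tree, with deblob-check reachable through PATH and PYTHON set.
# run_deblob SRC_DIR SCRIPT_DIR VERSION PYTHON
#   SRC_DIR     unpacked Linux source directory
#   SCRIPT_DIR  absolute path of the directory holding deblob-VERSION
#               and deblob-check
#   VERSION     the VERSION part of the deblob-VERSION script name
#   PYTHON      path of the Python interpreter to export as $PYTHON
run_deblob() {
    (
        cd "$1" || exit 1
        PATH="$2:$PATH" PYTHON="$4" "$2/deblob-$3"
    )
}
```

The exit status of the script becomes the exit status of run_deblob,
so a caller can stop the build on any non-zero result.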

Last I checked, 'deblob-check' was certainly being run by
'deblob-<VERSION>' as a subprocess, because I had to make several
substitutions of hard-coded paths before it would work in Guix
(e.g. /bin/sed and /usr/bin/python).

>> I had hoped that the deblob scripts would typically mostly work, even
>> if they weren't able to do a comprehensive cleaning.
>
> I'd honestly hope for a much higher standard than that for an
> FSDG-compliant distro, especially one that carries the GNU mark.

As I wrote below:

>> I would oppose adding such a partly-cleaned kernel to Guix itself,

With this in mind, your accusation above is not relevant to Guix.

Above, I was talking about my hope to enable users, *on their own
machines* and using *their own private build recipes*, to make a
best-effort deblobbing of a non-standard kernel variant that they need
to use for whatever reason.  If they aren't provided with that option,
the obvious alternative (which I expect 99% of such users would do
anyway) is to simply run a fully-blobbed kernel instead.

> But you don't!  That's what you get when you jump the gun and use
> outdated cleaning up scripts, without waiting for us to verify,
> update and release them for a newer version.

Here you are conflating two substantially different scenarios:

(1) Attempting to use your deblob scripts on a newer kernel that almost
    certainly includes many new drivers and blobs that aren't detected
    by your scripts.  That's the case that I said I would oppose for
    inclusion in Guix.

(2) Using the deblob scripts made for 5.4.57 on a 5.4.58 kernel in order
    to apply security fixes more quickly, and where the probability of
    uncleaned new blobs is quite low.

>> but I wanted to enable users who need to use some other branch of
>> Linux on their own systems to make a best-effort cleaning.
>
> Besides the likelihood of something going wrong, that seems like a
> backwards goal for a distro that is not expected to as much as point
> users at a non-Free package.

It's *not* a goal for Guix, and it wasn't even my motivation for
teaching Guix to run the Linux-libre deblob scripts.  It's just
something that, on a whim, I chose to include in my list of possible
advantages to having such functionality, nothing more.

> I'm sure that's not what you intend, but this arrangement, plus your
> mention of hurriedly getting releases out, adds up to an incentive to
> disable the deblobbing so as to get a faster build.

I don't understand how you reached this conclusion.  As far as I can
tell, changing Guix to run the deblob scripts made *no* difference to
what someone would have to do to ask Guix to build a fully-blobbed kernel.

> I hope you'll agree that this is undesirable.

Agreed.

>> In my experience, the deblob scripts are very rarely changed after the
>> first few point releases of a stable release series.
>
> My personal experience tells me otherwise.  5.7 had only one update at
> .8; 5.6, at .6 and .16; 5.5, at .3, .11 and .19; 5.4, at .14, .18, .27,
> .34 and .44; 5.3, at .4 and .11; 5.2 at .1, .3 and .11; 5.1 at .2, .18
> and .20; 5.0 at .7 and .16.  What you describe was true only of 4.17,
> 4.10, 4.3, 3.13, 3.5, and 3.2, i.e. 6 out of the 50 major releases
> starting at 3.0.

I only checked your claims regarding 5.4, and found that you're mistaken
about them being updated in 5.4.44.  In fact, the 'deblob-5.4' and
'deblob-check' files, as found in /pub/linux-libre/releases/, have not
changed since version 5.4.34.

Moreover, of the 4 deblob updates (.14, .18, .27, and .34) that have
*actually* been made so far during the 5.4.x series, IIUC only one of
them declared new blobs to remove, namely the update for 5.4.27.

The 5.4.14 update only removed extraneous backslashes in existing
regexps, changing "\e" to "e" and "\@" to "@".  I don't know whether
these extraneous backslashes caused blobs to be included in the
linux-libre tarballs, but if so, that presumably already happened in
5.4.13 and would have happened even if we had used your official
tarballs, no?

The 5.4.18 and 5.4.34 updates only added new 'accept' directives.  I
guess that means that temporarily omitting these additions wouldn't
cause new blobs to be included, is that right?

>> I know this because I always check for updates to the deblob scripts
>> whenever I update linux-libre in Guix.  In practice, the deblob
>> scripts used by Guix are never more than 1 or 2 micro versions behind
>> the version of Linux they are applied to.
>
> There have been 61 script updates for the 1274 4.*.*-gnu* and 5.*.*-gnu*
> stable releases, so Guix has shipped potentially non-FSDG code, that
> *would* have been flagged by deblob-check on the tarballs, at between 5%
> and 10% of these releases.  Does that sound like a good standard for a
> freedom-first distro to aim for?

If it were true that we've been including blobs in 5-10% of our
linux-libre releases, I agree that would be a serious problem.  However,
I believe your estimates are way off, so I took a closer look at the
statistics for the 5.4, 4.19, and 4.14 kernels.

I already wrote about 5.4 above.  If we include only the deblob updates
that added checks for new blobs, it's only happened once in 58 upstream
updates, i.e. for 1.7% of the updates.

In the 4.19 series, although the deblob scripts have been updated 8
times, of those 8, 3 only add 'accept' directives, and a fourth only
makes the same regexp fixes mentioned above ("\e" -> "e" and "\@" ->
"@").  In other words, only 4 of these deblob updates might result in
new blobs being recognized.  So that's 4 new blob updates out of 139
upstream updates, which comes out to 2.9%.

In the 4.14 series, the deblob scripts were updated 6 times, but 3 only
add 'accept' directives and a fourth only makes the regexp fix.  So that
comes out to 2 new blob updates out of 193 upstream updates, which comes
out to 1.0%.

So, unless I'm missing something, it's more accurate to say that when I
push a Linux-libre security update before waiting for you to bless it,
I'm taking a 1-3% risk that a blob might end up in the result.
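
The percentages above can be re-derived mechanically.  A small sketch,
where pct is just an illustrative helper name and the counts are the
ones quoted above (blob-adding script updates over upstream updates):

```shell
# pct NUMERATOR DENOMINATOR: print the ratio as a percentage, one decimal.
pct() {
    awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f%%\n", 100 * n / d }'
}

pct 1 58     # 5.4 series
pct 4 139    # 4.19 series
pct 2 193    # 4.14 series
```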

I find that level of risk undesirable.  I would certainly rather avoid
it.  I guess where you and I differ is that I *also* find it undesirable
to subject our users to unnecessary delays in getting these security
updates, because that *also* carries a risk, namely the risk that their
systems will be compromised due to a delayed security update.

To my mind, it makes sense to balance these two risks, especially since
we know that it's simply impractical to completely eliminate the risk of
non-FSDG-compliant code occasionally finding its way into Guix.

>>> The moment that the Linux-libre project determines that scripts are
>>> suitable is the moment that the new cleaned-up release is ready to
>>> publish in git and the appropriate tags will then appear in git. The
>>> compressed tarballs come some time later.
>
>> I prefer to avoid unnecessary delays when applying micro kernel updates,
>
> Sorry, but it doesn't look like you do.  If you did, you would be taking
> a cleaned up tree instead of re-deblobbing it.

I'm not concerned about another 30 minutes (or whatever) to run the
deblob scripts, especially if the alternative is to trust the integrity
of your machines unnecessarily.  The delays I'd prefer to avoid are ones
measured in tens of hours, which is occasionally how long it takes
before Linux-libre reacts to a new upstream update.

> You skip even the automated verification we do, which saves you some
> time, but at what price?

As I wrote above, if there's some automated verification that we are
failing to do, please tell me how to do it.  It was certainly not my
intent to skip any such verification.

>> I also consider it unwise for all of us, as a matter of habit or policy,
>> to trust the integrity of the computer systems used by the Linux-libre
>> project to perform the deblobbing.
>
> I welcome double-checking of our cleaning up at all levels, but why are
> you setting a higher trust standard for us than for a project known to
> be at odds with our shared goals, such as Linux?

I don't understand how you reached the conclusion that I'm setting a
higher trust standard for Linux-libre than for Linux.  The principle I'm
following here is simply to avoid relying on the integrity of any system
if I can easily avoid it.  In particular, if I can easily run an
automated process on my own machine instead of relying on some other
system to provide pre-generated outputs for me, then I prefer to do it
myself.

> You don't apply the patches that went into it since the last known
> good release to double-check their releases, do you?  For most
> projects, you just take their tarballs or tags and build it.

That's true, and I agree that it's something we could improve on.
It would be preferable to fetch from a git repository instead, and
preferably one that has a lot of eyes on it.

> For Linux-libre, you start from (untrustworthy) Linux, run the
> (presumed untrustworthy) cleaning up scripts, and blindly trust the
> result.

I agree that we cannot avoid trusting many people and systems, and that
in most cases that trust is blind.  In this case, we cannot avoid
trusting the Linux source code (even if we download exclusively from the
Linux-libre project), and we cannot avoid trusting the Linux-libre
deblob scripts.  However, I reject the argument that because we must
trust X and Y, we might as well trust Z as well.

> There's no self-verification run with deblob-check,

Again, if we're failing to do that, it's a bug that has not previously
been brought to my attention.  See above.

> no compare with our release, nothing.  If you were to test the
> integrity of our releases, you'd think you'd at least look at them.

I *did* compare with your releases when I first taught Guix how to run
the deblob scripts, but not since then.  Anyway, I fail to see the
relevance of this fact.  I agree that it would be useful for someone
running Guix to compare our generated tarballs to yours.  There are
millions of useful things I *could* do with my time, but alas, my
energies are limited.

> If you were to test the integrity of our releases, you'd think you'd at
> least look at them.  Starting from a known-good Linux release and
> applying patches to double-check the results is expensive, so it makes
> sense to do that only occasionally, rather than as part of every build.
> Deblobbing and checking the result is also expensive, so it also makes
> sense for you to do so only occasionally, rather than as part of every
> build.

As far as I can tell, the vast majority of Guix users use substitutes
provided by its build farm.  I guess that it's fairly rare for people to
build everything on their own machines, as I do.

> But the point stands that, for someone who'd rather trust no one, you're
> blindly trusting both Linux and Linux-libre.  The former when it comes
> to base releases you don't check; the latter when it comes to scripts
> whose results you hardly even look at.  Why not reduce your trust base
> to just Linux-libre,

That's not possible.  Clearly, you do not have the capacity to audit all
of the code that Linux produces.  Therefore, by trusting Linux-libre, we
must implicitly also trust the Linux project.  That much we cannot
avoid.  We also cannot avoid trusting your deblob scripts.

However, we *can* easily avoid trusting the integrity of the systems
that you use to run the deblob scripts.

> and treat it as a citizen of the same class as
> nearly every other project you build, and satisfy your trust-but-verify
> needs looking into what changes between one of our releases and another?

You seem to be suggesting that I'm treating Linux-libre with less
respect than other projects in Guix.  I reject that claim.

In fact, I strongly support reducing Guix's reliance on pre-generated
outputs produced by *any* project.  I'm not singling out the Linux-libre
project here.

For example, one of the things I've recently been thinking about is that
Guix currently trusts the integrity of all the scripts generated by
autoconf/automake/libtool/etc for most of the tarballs that we download.
Those scripts are generated on random developer machines, and they are
very difficult to reproduce, because they depend on the precise versions
of many other packages, and Debian also seems to have extensively
modified their automake, leading to other differences.  I would be in
favor of working toward generating those scripts in Guix itself where
possible, but it's a big job and likely to cause maintenance headaches.

For another example, I also taught Guix how to generate the IceCat
source tarball from the corresponding Firefox tarball, and I intend to
keep it that way, although I'm currently an IceCat maintainer.

>> One question: Would it solve the problem that I mentioned in my earlier
>> email, namely the problem of how to determine which precise commit
>> introduced a regression between two stable kernel releases?
>
> No.  There are much better (faster and less risky) ways to tend to that
> requirement, see #bisecting below.
[...]
> #bisecting
>
> You can even take one of our releases and apply the patches that went
> into the next upstream stable release, and check that what you get
> matches our own corresponding release.  Some 98% of the time, they will
> be exact matches.  Occasionally, there will be a difference, and then
> you'll likely find a corresponding change in the deblobbing scripts, or
> a preexisting pattern that caused the change.  We do this for every
> release, as part of our pre-release checks, and you're welcome to do so
> as well, and to use the resulting tree to bisect problems.

I agree that this would be faster, but I fail to see how it's "less
risky" than running the deblob scripts meant for Linux-libre X.Y.Z on a
git checkout of the upstream stable git repo between X.Y.(Z-1) and
X.Y.Z.

More importantly, it's a much less straightforward thing to implement.
In the current implementation, we get the ability to deblob arbitrary
git commits from the same stable branch essentially for free.

I guess you're suggesting that I should implement a radically different
mechanism specifically for this purpose, that extracts the individual
patches from the upstream stable git repository, attempt to apply them
to the base Linux-libre release, compare that to the next Linux-libre
release, and then implement my own bisection functionality.

If I were to implement this, what would you suggest I do if the patches
fail to apply, or if the result fails to match the next Linux-libre
release?

     Thanks,
       Mark

* Re: Linux-libre 5.8 and beyond
  2020-08-15  6:03     ` Mark H Weaver
@ 2020-08-16  1:24       ` Mark H Weaver
  2020-08-16 12:43         ` Jason Self
  2020-08-16 10:54       ` Jason Self
                         ` (5 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Mark H Weaver @ 2020-08-16  1:24 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: guix-devel, Jason Self

Hi Alexandre,

I thought about it some more, and I've changed my mind on one point:
I've decided that for future kernel updates, in order to eliminate the
risk of unintentionally allowing blobs into Guix, I will either wait for
Linux-libre to publish updated deblob scripts, or else I will manually
check for new blobs.

     Thanks,
       Mark

* Re: Linux-libre 5.8 and beyond
  2020-08-16  1:24       ` Mark H Weaver
@ 2020-08-16 12:43         ` Jason Self
  0 siblings, 0 replies; 30+ messages in thread
From: Jason Self @ 2020-08-16 12:43 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 2311 bytes --]

On Sat, 15 Aug 2020 21:24:08 -0400
Mark H Weaver <mhw@netris.org> wrote:

> Hi Alexandre,
> 
> I thought about it some more, and I've changed my mind on one point:
> I've decided that for future kernel updates, in order to eliminate the
> risk of unintentionally allowing blobs into Guix, I will either wait
> for Linux-libre to publish updated deblob scripts, or else I will
> manually check for new blobs.

This can be determined by checking for the availability of the new
kernel version in git. The git repository is updated first, prior to
tarballs being created so I assume you'd want to be looking there given
that speed of updates seems important. If the new kernel version
appears without a corresponding script update then you can know that no
script updates were determined to be necessary.

Wouldn't a better setup be to obtain the desired kernel version from
Linux-libre, obtain the desired kernel version from kernel.org,
independently run the clean-up scripts, and then toss out the results
from kernel.org once the source code is determined to be identical?*

I mean, if you're already willing to wait until the analysis of whether
updated cleanup scripts are needed or not has been done, then you're
already at the point of the Linux-libre kernel source code being
available too because once that determination is made, any updated
scripts and the corresponding kernel source code are pushed into git
simultaneously.

Confirming if the results you get from the cleanup scripts are the same
is helpful all around. You don't need to trust the Linux-libre project
infrastructure, because you're verifying the integrity yourself, and
you also get the benefit of the double verification steps that are
done, which check that the version does in fact correspond to the
upstream version plus the changes that Linux-libre made, and that it
also corresponds to the previous release plus the incremental patches.

* As a disclaimer there may be one difference in that the clean-up
  scripts will in some cases delete all of the files in a directory
  while leaving the directory itself in place. Git doesn't track empty
  directories and so diffing of the entire kernel source code would
  reveal that. The diff should otherwise report everything to be
  identical.
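
A minimal sketch of such a comparison, assuming GNU find, sort, xargs
and sha256sum are available; tree_digest and same_files are illustrative
names, and the two directories are whatever trees were checked out or
generated locally.  Hashing only regular files sidesteps the
empty-directory difference mentioned in the footnote, but it also
ignores symlinks and permissions, so it is weaker than a full diff:

```shell
# tree_digest DIR: print a sorted "hash  path" list of DIR's regular files.
tree_digest() {
    (
        cd "$1" || exit 1
        find . -type f -print0 | LC_ALL=C sort -z | xargs -0 -r sha256sum
    )
}

# same_files DIR_A DIR_B: succeed only if both trees hold the same files
# with the same contents; empty directories do not count.
same_files() {
    [ "$(tree_digest "$1")" = "$(tree_digest "$2")" ]
}
```

With that, "same_files libre-tree locally-cleaned-tree" (both names are
placeholders) succeeds only when the file contents agree.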

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

* Re: Linux-libre 5.8 and beyond
  2020-08-15  6:03     ` Mark H Weaver
  2020-08-16  1:24       ` Mark H Weaver
@ 2020-08-16 10:54       ` Jason Self
  2020-08-24  3:45       ` Alexandre Oliva
                         ` (4 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Jason Self @ 2020-08-16 10:54 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 100 bytes --]

I always thought the reproducible builds mantra was "trust but verify",
not to actively distrust?

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 801 bytes --]

* Re: Linux-libre 5.8 and beyond
  2020-08-15  6:03     ` Mark H Weaver
  2020-08-16  1:24       ` Mark H Weaver
  2020-08-16 10:54       ` Jason Self
@ 2020-08-24  3:45       ` Alexandre Oliva
  2020-08-25  4:14         ` Mark H Weaver
  2020-08-24  3:58       ` Alexandre Oliva
                         ` (3 subsequent siblings)
  6 siblings, 1 reply; 30+ messages in thread
From: Alexandre Oliva @ 2020-08-24  3:45 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Jason Self

Hello, Mark,

Apologies for the delay in responding.  It's been an "interesting" week.

I'm breaking up what turned out to be a very very long reply into
multiple posts, so as to address the various issues in separate posts,
that might very well turn into separate subthreads.

On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:

> Alexandre Oliva <lxoliva@fsfla.org> wrote:

>> No.  There are much better (faster and less risky) ways to tend to that
>> requirement, see #bisecting below.
> [...]
>> #bisecting
>> 
>> You can even take one of our releases and apply the patches that went
>> into the next upstream stable release, and check that what you get
>> matches our own corresponding release.  Some 98% of the time, they will
>> be exact matches.  Occasionally, there will be a difference, and then
>> you'll likely find a corresponding change in the deblobbing scripts, or
>> a preexisting pattern that caused the change.  We do this for every
>> release, as part of our pre-release checks, and you're welcome to do so
>> as well, and to use the resulting tree to bisect problems.

> I agree that this would be faster, but I fail to see how it's "less
> risky" than running the deblob scripts meant for Linux-libre X.Y.Z on a
> git checkout of the upstream stable git repo between X.Y.(Z-1) and
> X.Y.Z.

You're right.  My mistake was failing to mention the need to compare
between X.Y.Z-gnu* and the result of rebasing the X.Y.(Z-1)..X.Y.Z
patches onto X.Y.(Z-1)-gnu* to enjoy the safety I had in mind.

If the changes made to both ends are the same, then it's not entirely
unreasonable to assume that all intermediate commits would also be
properly cleaned up with the same set of changes, assuming the history
in between is linear (i.e., only cherry-picks, not merges that could
take a bisect to much earlier commits).

Even with linear history, you might admittedly still be surprised if an
intervening patch introduces something undesirable and a subsequent one
reverses it.  Running through deblob-check each version of each modified
file that matches neither boundary would mechanically avoid nearly all
such surprises.

> More importantly, it's a much less straightforward thing to implement.

It really isn't.  Once you remove the deblobbing and point the recipe
directly at the linux-libre git repo, users will be able to do the above
rebasing on a local repo, and build reasonably quickly any of the
commits that the local git bisect tells them to try.

> In the current implementation, we get the ability to deblob arbitrary
> git commits from the same stable branch essentially for free.

Taking arbitrary commits from a known non-Free repo is really not
something to be encouraged, given the odds of hitting freedom problems.
When using the latest scripts for a stable series, odds are the
procedure you suggest would work more or less reliably within that
stable series, but we've recently had an example in the 5.7-gnu series
in which it wouldn't.

> I guess you're suggesting that I should implement a radically different
> mechanism specifically for this purpose, that extracts the individual
> patches from the upstream stable git repository, attempt to apply them
> to the base Linux-libre release, compare that to the next Linux-libre
> release, and then implement my own bisection functionality.

git rebase --onto libre/vX.Y.(Z-1)-gnu nonfree/vX.Y.(Z-1) nonfree/vX.Y.Z
git diff libre/vX.Y.Z-gnu
git bisect start HEAD libre/vX.Y.(Z-1)-gnu
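
With the placeholders filled in for a concrete release, that could look
like the sketch below.  It only prints the commands rather than running
them.  prev_release and bisect_plan are made-up helper names, and the
libre/ and nonfree/ prefixes are just the placeholder notation used
above, not necessarily real ref names.

```shell
# prev_release X.Y.Z: print X.Y.(Z-1).  Assumes Z >= 1; note that for
# Z = 1 the upstream base release may be tagged differently.
prev_release() {
    base=${1%.*}      # X.Y
    patch=${1##*.}    # Z
    echo "$base.$((patch - 1))"
}

# bisect_plan X.Y.Z: print (not run) the three commands for that release.
bisect_plan() {
    new=$1
    old=$(prev_release "$new")
    echo "git rebase --onto libre/v${old}-gnu nonfree/v${old} nonfree/v${new}"
    echo "git diff libre/v${new}-gnu"
    echo "git bisect start HEAD libre/v${old}-gnu"
}

bisect_plan 5.4.34
```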

> If I were to implement this, what would you suggest I do if the patches
> fail to apply

Look at the conflict presented by the rebase, and resolve the likely
freedom issue introduced at that point.

> if the result fails to match the next Linux-libre release?

Identify the intervening commit where code that got cleaned up
differently was introduced or removed and make the change to the
deblobbing at that point; rinse and repeat.

If differences remain that are not caused by the patches, it's something
that changed in the scripts, possibly improving or correcting an earlier
deblobbing error, e.g., something cleaned up that was found to be Free,
or something that was missed in deblobbing, or a different way to clean
it up.  Such differences will likely be noticeable in the scripts.  It's
probably best to turn the difference in cleaning up into a separate
commit for the purposes of the bisection, just in case it is the source
of the issue being investigated.

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer

* Re: Linux-libre 5.8 and beyond
  2020-08-24  3:45       ` Alexandre Oliva
@ 2020-08-25  4:14         ` Mark H Weaver
  2020-08-25 11:12           ` Alexandre Oliva
  0 siblings, 1 reply; 30+ messages in thread
From: Mark H Weaver @ 2020-08-25  4:14 UTC (permalink / raw)
  To: Alexandre Oliva; +Cc: guix-devel, Jason Self

Hi Alexandre,

Alexandre Oliva <lxoliva@fsfla.org> wrote:
> On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:
>
>> If I were to implement this, what would you suggest I do if the patches
>> fail to apply
>
> Look at the conflict presented by the rebase, and resolve the likely
> freedom issue introduced at that point.
>
>> if the result fails to match the next Linux-libre release?
>
> Identify the intervening commit where code that got cleaned up
> differently was introduced or removed and make the change to the
> deblobbing at that point; rinse and repeat.

In other words, your proposed approach cannot be done automatically in
the general case.  Do you see how this is a problem?

If a Guix user reports that one of their devices stopped working in
Linux-libre-5.4.34, I'd like to enable them to easily build deblobbed
kernels at intermediate commits on the upstream stable/linux-5.4.y
branch.  With the present approach, I can provide a simple Guix recipe
to do this automatically.  With your proposed approach, the user may
need to manually resolve merge conflicts and so on.  Not all Guix users
will have the skills or motivation to do this.

Even _I_ would not want to do this, because it would mean doing
unnecessary labor.  I would much rather have my computer do this job
while I do something else, even if it takes longer.

       Mark

* Re: Linux-libre 5.8 and beyond
  2020-08-25  4:14         ` Mark H Weaver
@ 2020-08-25 11:12           ` Alexandre Oliva
  0 siblings, 0 replies; 30+ messages in thread
From: Alexandre Oliva @ 2020-08-25 11:12 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Jason Self

Hello, Mark,

On Aug 25, 2020, Mark H Weaver <mhw@netris.org> wrote:

> Alexandre Oliva <lxoliva@fsfla.org> wrote:
>> On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:
>> 
>>> If I were to implement this, what would you suggest I do if the patches
>>> fail to apply
>> 
>> Look at the conflict presented by the rebase, and resolve the likely
>> freedom issue introduced at that point.
>> 
>>> if the result fails to match the next Linux-libre release?
>> 
>> Identify the intervening commit where code that got cleaned up
>> differently was introduced or removed and make the change to the
>> deblobbing at that point; rinse and repeat.

> In other words, your proposed approach cannot be done automatically in
> the general case.  Do you see how this is a problem?

Remember a few emails ago when *you* argued that the changes to the
deblobbing scripts were so infrequent that it would do no harm to use
the scripts for an earlier release, without waiting for us to check that
they work for a newer one?

How come now the very same circumstances have become so frequent as to
be a problem?

You see, the cases in which there would be patch conflicts and need for
manual resolution are those in which the cleanups made by older scripts
are no longer enough to clean up subsequent trees.

Most of these are cases in which manual intervention is required to
adjust the scripts.  But you wish to use the scripts to clean up
intervening commits that they were never tested to work on and that they
may actually fail on, leaving non-FSDG bits in place, *instead* of using
a procedure that will reliably tell you about the IYO rare cases in
which manual intervention is required.

You dismiss our automated and manual verifications, you used to use
outdated scripts, but you can't be bothered to run a simple procedure to
check that there aren't freedom issues introduced in an upstream stable
release, and to make the required manual adjustments to keep the builds
FSDG-compliant?

Heck, I'll be glad to publish, upon request, in the Linux-libre git
repo, a verified-FSDG incremental stable release branch, i.e., a branch
starting from one release and ending at a tree identical to that of a
subsequent release in the same stable branch.  Then you can point users
at that branch for bisecting within that range.

Should that be as seamless as I expect it to be, I might even start
doing that regularly, for all stable releases, as an incremental step
towards the git repo with the cleaned-up commit history of Linux
development.

> If a Guix user reports that one of their devices stopped working in
> Linux-libre-5.4.34, I'd like to enable them to easily build deblobbed
> kernels at intermediate commits on the upstream stable/linux-5.4.y
> branch.

That amounts to referring users to a non-Free source repo; that's not
desirable per the FSDG.  If we want to enable them to bisect for us, we
should offer them our own cleaned-up history.

> With the present approach, I can provide a simple Guix recipe
> to do this automatically.

... and quite prone to misuse due to unreasonable expectations that our
scripts can't live up to.

> With your proposed approach, the user may
> need to manually resolve merge conflicts and so on.  Not all Guix users
> will have the skills or motivation to do this.

*We*, not the users, should get to prepare the cleaned-up repo and fix
the conflicts, if we wish them to bisect for us.

> Even _I_ would not want to do this, because it would mean doing
> unnecessary labor.

It would be unnecessary if we had magic scripts that worked reliably no
matter what you threw at them.  This is not the case, and your
expectation that it is makes you perceive that as unnecessary labor.

That we do only part of that labor, checking only actual releases rather
than all intervening commits, might have been enough of a hint that it
takes actual labor to get what you expect to get with zero effort.

> I would much rather have my computer do this job
> while I do something else, even if it takes longer.

It can do much of it, but not all of it as you expect.  You're
unfortunately objecting to the manual labor in the very cases in which
your computer is unable to do it :-/

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer

* Re: Linux-libre 5.8 and beyond
  2020-08-15  6:03     ` Mark H Weaver
                         ` (2 preceding siblings ...)
  2020-08-24  3:45       ` Alexandre Oliva
@ 2020-08-24  3:58       ` Alexandre Oliva
  2020-08-24  4:12       ` Alexandre Oliva
                         ` (2 subsequent siblings)
  6 siblings, 0 replies; 30+ messages in thread
From: Alexandre Oliva @ 2020-08-24  3:58 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Jason Self

On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:

> Alexandre Oliva <lxoliva@fsfla.org> wrote:
>> On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote:
>> 
>>>>> It may be useful for users with newer hardware devices, which are
>>>>> not yet well supported by the latest stable release, to use an
>>>>> arbitrary commit from either Linus' mainline git repository or some
>>>>> other subsystem tree.
>>>> 
>>>> The cleaning up scripts are version-specific and won't work on an
>>>> "arbitrary commit from Linus's mainline git repository" (i.e., someone
>>>> wanting to get today's most recent commit going into 5.9.) The scripts
>>>> would fall over and die in such a scenario,
>> 
>>> Okay, perhaps this was wishful thinking on my part.
>> 
>> Yup.  If you ran a deblob-check in verify mode on the resulting
>> tarballs, you'd see how error-prone this is.  You'd at least stop
>> non-Free code from silently sneaking in and finding its way into running
>> on users' machines.  That's the *least* someone who runs the
>> deblob-scripts on their own should do to smoke-test the result WRT
>> *known* freedom issues.

> What is this "verify mode" that you're referring to, and where is it
> documented?

I'm talking about the --list-blobs (default) option of deblob-check,
that tests whether an input file (source file, patch file, or tarball)
contains any suspicious patterns.  Running deblob-check with --help
prints a significant amount of documentation, though it is mostly aimed
at the internal purposes that the scripts serve.

The cleaning up scripts are not really meant to be blindly used by third
parties to clean up anything but releases they're associated with;
they're provided for documentation and transparency purposes, but
they're not even something whose existence you should count on.  E.g.,
once we realize the long-term vision of having a git repo with the
entire history, manually cleaned-up, there won't be a script to clean
things up any more, though there will surely still be something to help
us identify anything that needs cleaning up.

> The word "verify" does not occur in either of the deblob
> scripts that I know about

That's what the 'check' in deblob-check stands for.  Originally, it
would only scan for blobs.  Later on, it was extended with other actions
for use in cleaning up.

> I don't see anything like a verification mode mentioned
> in the options documented at the top of those two scripts.

Indeed, deblob-<VERSION>, which is what you use, and deblob-check, as
used by it, do not perform any verification whatsoever.  They're not
meant to.  They automate and document what we intend to clean up.  The
verifications are steps we take once we have a candidate release, that
tell us whether or not it's fit for release.  If it isn't, we adjust the
scripts and start over.

You'd have to run deblob-check linux-libre-<VERSION>-guix.tar on the
cleaned up tarball to check that none of the suspicious patterns known
by deblob-check have survived in the resulting tarball.  It would have
caught the errors that Vagrant hit the other day, and it would have
reported the deblobbing errors you'd have got this week had you not
waited for the updated scripts.
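
As a rough illustration of the check described above (a sketch, not part
of the Linux-libre tooling), the scan could be wrapped as follows.  The
function names are hypothetical; the only things taken from this message
are the deblob-check command, its --list-blobs default action, and the
tarball argument.  How the exit status and output should be interpreted
is deliberately left to the caller, since that is not stated here.

```python
import subprocess
from typing import List, Tuple


def build_check_command(tarball: str, deblob_check: str = "deblob-check") -> List[str]:
    """Return the argv for scanning a cleaned-up tarball with deblob-check.

    --list-blobs is the default action, so passing it explicitly only
    documents the intent.
    """
    return [deblob_check, "--list-blobs", tarball]


def scan_tarball(tarball: str, deblob_check: str = "deblob-check") -> Tuple[int, str]:
    """Run the scan and return (exit status, combined stdout/stderr).

    Interpreting the result is an assumption left to the caller; consult
    `deblob-check --help` rather than relying on this sketch.
    """
    proc = subprocess.run(
        build_check_command(tarball, deblob_check),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return proc.returncode, proc.stdout
```

A build recipe could call scan_tarball() on its own output and refuse to
continue until someone has looked at whatever was reported.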

Running this script in -B or -C modes is part of our development process
for new releases, and it is also one of our safety nets to stop us from
releasing non-Free Software: we run it for every release before putting
it out.

> For the record, it was not my intent to skip any automated checking
> provided by these scripts.

I understand it was not your intent, but when the scripts are used in
environments they weren't tested in, with upstream releases or commits
they weren't meant for, the expectation that they will do the job you
wish without any of the verification steps we perform is misplaced.

> If we're running the scripts in a suboptimal
> way, please tell me a better way.

> FYI, right now we're simply running the main 'deblob-<VERSION>' script
> with no arguments in the unpacked Linux source directory, with the
> corresponding 'deblob-check' script in $PATH and $PYTHON pointing to
> python 2.x.  If 'deblob-<VERSION>' exits abnormally or with a non-zero
> result, the Guix build process fails.

> Last I checked, 'deblob-check' was certainly being run by
> 'deblob-<VERSION>' as a subprocess, because I had to make several
> substitutions of hard-coded paths before it would work in Guix
> (e.g. /bin/sed and /usr/bin/python).

The expected use of the scripts, for people who wish to verify that our
releases have been cleaned up as specified in the scripts, is to do just
what you do, and then compare the resulting source tree with that of our
release.  If they match, you know we haven't sneaked in any unintended
changes.  If they don't, something went wrong on either end.
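
The tree comparison described above might be sketched like this.  It is
a minimal illustration under stated assumptions: both trees are already
unpacked locally, the directory paths are supplied by the caller, and
the helper names are hypothetical rather than anything shipped by
Linux-libre or Guix.

```python
import hashlib
import os
from typing import Dict, List


def tree_digests(root: str) -> Dict[str, str]:
    """Map each file path (relative to root) to a digest of its content."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            rel = os.path.relpath(path, root)
            if os.path.islink(path):
                # Compare symlinks by target instead of following them.
                digests[rel] = "link:" + os.readlink(path)
                continue
            with open(path, "rb") as handle:
                digests[rel] = hashlib.sha256(handle.read()).hexdigest()
    return digests


def compare_trees(ours: str, theirs: str) -> List[str]:
    """Return the sorted relative paths that are missing or differ."""
    a, b = tree_digests(ours), tree_digests(theirs)
    return sorted(path for path in set(a) | set(b) if a.get(path) != b.get(path))
```

An empty list means the locally deblobbed tree matches the released one;
any other result names the files worth inspecting by hand.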

Given our amount of experience and automation in the release and
verification processes, which scan the resulting source tree and also
compare the changes with those made by an earlier known-good recent
release, a platform-specific bug in the underlying tools, an unexpected
change to regexp engines (as in some recent version of python3), or the
use of mismatched scripts are more likely sources of differences than
our failing to notice an unexpected change on our end.

Now, if you wanted to use the scripts for purposes other than
verification, e.g., to clean up releases before we check them, or even
after we put them out but without any attempt to verify that the result
you get is indeed what we put out, you should take responsibility for
verifying the releases at least as much as we do, otherwise any freedom
issues arising from your not catching a problem we would have caught
would unfairly reflect negatively on our project.

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer


* Re: Linux-libre 5.8 and beyond
  2020-08-15  6:03     ` Mark H Weaver
                         ` (3 preceding siblings ...)
  2020-08-24  3:58       ` Alexandre Oliva
@ 2020-08-24  4:12       ` Alexandre Oliva
  2020-08-24  4:34       ` Alexandre Oliva
  2020-08-24  4:42       ` Alexandre Oliva
  6 siblings, 0 replies; 30+ messages in thread
From: Alexandre Oliva @ 2020-08-24  4:12 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Jason Self

On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:

> I only checked your claims regarding 5.4, and found that you're mistaken
> about them being updated in 5.4.44.

There was a change to scripts at 5.4.44, just not one you cared about,
because you didn't use the (discontinued) deblob-main script to prepare
a cleaned-up source tarball.

> Moreover, of the 4 deblob updates (.14, .18, .27, and .34) that have
> *actually* been made so far during the 5.4.x series, IIUC only one of
> them declared new blobs to remove, namely the update for 5.4.27.

That's missing the point.  Nearly all of these changes were motivated by
changes reported as suspicious in our verification.  Some turned out to
be false positives, but they might as well have been new blobs.  Any
change has a potential to introduce new blobs, and the fact that our
verification catches suspicious changes that you'd have quietly
published as Free Software is the risk you're passing on to your users
instead of living up to the expectation that you're doing your best to
ensure they're not getting any non-Free Software from you.  The value we
provide, of checking that for every release, you're throwing on the
floor.

Yes, the releases that would *actually* introduce undesirable changes,
vs merely suspicious ones that turn out to be false positives, are a
smaller fraction of the total.  But what you're doing right now is
driving with blinders on because then you can go faster, because history
has shown there's only a 5% or 2% chance of hitting a bus.

> The 5.4.14 update only removed extraneous backslashes in existing
> regexps, changing "\e" to "e" and "\@" to "@".

That was in response to a change in python3.7 (?) regexp engine.
Fortunately, all one got from the extraneous backslashes were warnings.
But it could have been an actual change in output, or a failure to match
a pattern that ought to have been cleaned up, and since you don't
compare with our releases, you could have got non-Free results as much
as from a newly-introduced bit from upstream.

> I don't know whether these extraneous backslashes caused blobs to be
> included in the linux-libre tarballs, but if so, that presumably
> already happened in 5.4.13 and would have happened even if we had used
> your official tarballs, no?

No.  If we'd hit it ourselves, our release engineering procedures would
have caught the unexpected change.  That's why treating our scripts
(rather than our releases) as the ultimate truth, is error prone: the
underlying tools are complex and subject to change and bugs.

If you don't verify that their output isn't garbage (by comparing with
our manually verified releases, or by performing equivalent automated
and manual checks), you may end up shipping that garbage.  Odds are that
you already have.

> The 5.4.18 and 5.4.34 updates only added new 'accept' directives.  I
> guess that means that temporarily omitting these additions wouldn't
> cause new blobs to be included, is that right?

You're probably right for these instances, but it does not necessarily
follow that script changes that only add 'accept' patterns wouldn't get
you in trouble without them.  At times, we've had to add accept
statements to match newly-added occurrences of '.firmware' in such
constructs as:

struct foo var = {
       .whatever = value,
       .firmware = "filename",
       ...
};

These initializers are regarded as suspicious, so they need to be
manually marked as accepted, whether or not the filename turns out to be
a blob name that we clean up.

Without arranging for a newly-introduced '.firmware' initializer to be
accepted, this may end up cleaned up into:

struct foo var = {
       .whatever = value,
       /*(DEBLOBBED)*/ "/*(DEBLOBBED)*/",
       ...
};

which will get you a successful cleaning up session (say, if the
firmware name was already known, in a file that we already cleaned up),
and even a successful compilation, but, depending on the order of the
fields in struct foo, the cleaned-up firmware name may end up used to
initialize the wrong field.
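
To make the field-order hazard concrete, here is a small Python model of
how C binds initializer values to struct fields.  It is only a sketch:
the three-field layout of struct foo is hypothetical (the real layout
depends on the driver), and the model covers just the rule relevant
here, namely that an initializer without a designator lands in the field
following the previously initialized one.

```python
from typing import Dict, List, Optional, Tuple

# (designator or None, value as written in the source)
Initializer = Tuple[Optional[str], str]


def bind_initializers(fields: List[str], inits: List[Initializer]) -> Dict[str, str]:
    """Model which struct field each C initializer value ends up in.

    A designated entry (".name = value") selects its field by name; an
    entry without a designator initializes the next field in declaration
    order after the one initialized before it.
    """
    bound = {}
    position = 0
    for designator, value in inits:
        if designator is not None:
            position = fields.index(designator)
        bound[fields[position]] = value
        position += 1
    return bound


# Hypothetical declaration order for struct foo.
FIELDS = ["whatever", "name", "firmware"]

original = [("whatever", "value"), ("firmware", '"filename"')]
cleaned = [("whatever", "value"), (None, '"/*(DEBLOBBED)*/"')]

print(bind_initializers(FIELDS, original))
# {'whatever': 'value', 'firmware': '"filename"'}
print(bind_initializers(FIELDS, cleaned))
# {'whatever': 'value', 'name': '"/*(DEBLOBBED)*/"'}
```

With this layout the cleaned-up string silently initializes `name`
instead of `firmware`, which is the kind of quiet breakage an 'accept'
pattern is there to prevent.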

>>> I know this because I always check for updates to the deblob scripts
>>> whenever I update linux-libre in Guix.  In practice, the deblob scripts used by
>>> Guix are never more than 1 or 2 micro versions behind the version of
>>> Linux they are applied to.
>> 
>> There have been 61 script updates for the 1274 4.*.*-gnu* and 5.*.*-gnu*
>> stable releases, so Guix has shipped potentially non-FSDG code, that
>> *would* have been flagged by deblob-check on the tarballs, at between 5%
>> and 10% of these releases.  Does that sound like a good standard for a
>> freedom-first distro to aim for?

> If it were true that we've been including blobs in 5-10% of our
> linux-libre releases, I agree that would be a serious problem.

Not what I meant, FWIW.  What I meant was that in 5-10% of the times you
might have *known* you had something wrong in your cleaned up tree if
you'd just run deblob-check on it for one of the automated
verifications.

> I already wrote about 5.4 above.  If we include only the deblob updates
> that added checks for new blobs, it's only happened once in 58 upstream
> updates, i.e. for 1.7% of the updates.

The statistics you're using, counting only the suspicious changes that
were not false positives, is analogous to saying that jaywalking, or
driving across a red light, without even looking, are acceptable as long
as you don't get hit or caught.

Getting lucky 90%, 95% or even 98% of the time doesn't make up for
disregarding the procedures that would have warned you of avoidable
issues, whether or not they turn out to be actual freedom issues.

The other reason you got much lower results than me was that I made room
for your recipe's lagging for up to 2 releases (thus the 5% of stable
releases requiring deblobbing changes turn to 10%), as you'd said, while
you seem to have done yours assuming they'd lag for at most 1.
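
The figures traded above can be reproduced with a few lines of
arithmetic.  This is a sketch under a simplifying assumption that is
mine, not a measured fact: each script update leaves `max_lag` releases
exposed, and those windows never overlap.

```python
def flagged_fraction(script_updates: int, releases: int, max_lag: int = 1) -> float:
    """Fraction of releases built with scripts older than the ones they needed."""
    return script_updates * max_lag / releases


# 61 script updates over 1274 stable releases, lagging by 1 or 2 releases.
lag_one = flagged_fraction(61, 1274, max_lag=1)
lag_two = flagged_fraction(61, 1274, max_lag=2)

# Counting only the one 5.4.x update that declared new blobs, out of 58.
blob_only = flagged_fraction(1, 58)

print(f"{lag_one:.1%}")    # 4.8%
print(f"{lag_two:.1%}")    # 9.6%
print(f"{blob_only:.1%}")  # 1.7%
```

The first two numbers are the "between 5% and 10%" range; the third is
the 1.7% obtained by counting only updates that were not false
positives.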

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer


* Re: Linux-libre 5.8 and beyond
  2020-08-15  6:03     ` Mark H Weaver
                         ` (4 preceding siblings ...)
  2020-08-24  4:12       ` Alexandre Oliva
@ 2020-08-24  4:34       ` Alexandre Oliva
  2020-08-24  4:42       ` Alexandre Oliva
  6 siblings, 0 replies; 30+ messages in thread
From: Alexandre Oliva @ 2020-08-24  4:34 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Jason Self

On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:

> Alexandre Oliva <lxoliva@fsfla.org> wrote:
>> On Aug 12, 2020, Mark H Weaver <mhw@netris.org> wrote:

>>> I also consider it unwise for all of us, as a matter of habit or policy,
>>> to trust the integrity of the computer systems used by the Linux-libre
>>> project to perform the deblobbing.
>> 
>> I welcome double-checking of our cleaning up at all levels, but why are
>> you setting a higher trust standard for us than for a project known to
>> be at odds with our shared goals, such as Linux?

> I don't understand how you reached the conclusion that I'm setting a
> higher trust standard for Linux-libre than for Linux.

You blindly trust Linux release tags, but not ours.

OTOH, you're right that it's not a strictly higher standard.  You also
trust our cleanup scripts, even when we tell you they're not fit for the
use cases you put them through.

> The principle I'm following here is simply to avoid relying on the
> integrity of any system if I can easily avoid it.

You could avoid relying on the integrity of Linux release tags, and
trust ours instead.  That's what tells me you don't trust Linux-libre as
much as you do Linux.

You could use our tags, at the very least to check that you got
something sensible out of your own deblobbing run, but you don't even
look at them.  You're not checking anything, so what you put builders
through is at best busy, redundant work, and at worst, a waste of cpu
cycles that doesn't even get them what they hope for.

> However, I reject the argument that because we must
> trust X and Y, we might as well trust Z as well.

That doesn't follow indeed.

What I'm saying was that, instead of trusting both X and Y, you might
trust just X instead, while you insisted on trusting mostly just Y
instead (but also X and a bunch of other tools used underneath).

>> But the point stands that, for someone who'd rather trust no one, you're
>> blindly trusting both Linux and Linux-libre.  The former when it comes
>> to base releases you don't check; the latter when it comes to scripts
>> whose results you hardly even look at.  Why not reduce your trust base
>> to just Linux-libre,

> That's not possible.  Clearly, you do not have the capacity to audit all
> of the code that Linux produces.  Therefore, by trusting Linux-libre, we
> must implicitly also trust the Linux project.  That much we cannot
> avoid.  We also cannot avoid trusting your deblob scripts.

True, we don't even attempt to audit Linux sources in this sense.  This
seems to imply that taking our cleaned-up sources, and taking Linux'
sources and cleaning them up, carries exactly the same amount of trust
on each project involved.  And yet you prefer to trust the one that
sneaks non-FSDG bits in every now and again, instead of the one that
hunts them down and removes them.

> However, we *can* easily avoid trusting the integrity of the systems
> that you use to run the deblob scripts.

You *could* avoid that, and also some blind trust on the underlying
tools and systems used for cleaning up by us and by you, by at least
*comparing* the cleaned-up tree you get with the one we provide.  But
that's not what you do.  You distrust us enough to shed doubts on our
processes, but you (and guix builders, trusting you) trust us enough to
run our scripts for purposes they aren't fit for, and trust a very complex
and fragile combination of tools and systems to carry out its difficult
job without giving their output a second look.

> In fact, I strongly support reducing Guix's reliance on pre-generated
> outputs produced by *any* project.  I'm not singling out the Linux-libre
> project here.

You really are.  You take most other projects' releases without anything
even close to the amount of scrutiny and disregard that you place on the
results of our release engineering processes and resulting release
tarballs and tags.  You might not think so if you consider the
deblobbing scripts we publish for transparency and verification as our
releases, but since they (very) occasionally remain unchanged even when
new changes need to be made (*), say because they by chance already
contain the code that makes the newly-needed changes, that supposed
equivalence is a mistake.

(*) just as I write this, I manually check 5.8.3-gnu and find a fresh
example of this, also applicable to 5.7.17-gnu and 5.4.60-gnu.

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer


* Re: Linux-libre 5.8 and beyond
  2020-08-15  6:03     ` Mark H Weaver
                         ` (5 preceding siblings ...)
  2020-08-24  4:34       ` Alexandre Oliva
@ 2020-08-24  4:42       ` Alexandre Oliva
  6 siblings, 0 replies; 30+ messages in thread
From: Alexandre Oliva @ 2020-08-24  4:42 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Jason Self

Hello, Mark,

On Aug 15, 2020, Mark H Weaver <mhw@netris.org> wrote:

> I was talking about my hope to enable users, *on their own
> machines* and using *their own private build recipes*, to make a
> best-effort deblobbing of a non-standard kernel variant that they need
> to use for whatever reason.

A non-free kernel, standard or not, shouldn't really be in scope for a
FSDG distro, IMHO.  Even the pointer to the non-Free releases used as a
starting point for build recipes comes across as undesirable to me, more
so when there's an expectation (and such a high concern) for enabling
users to use them, with a near-certainty that this will go silently
wrong freedom-wise.

> If they aren't provided with that option,
> the obvious alternative (which I expect 99% of such users would do
> anyway) is to simply run a fully-blobbed kernel instead.

I'm surprised that they'd prefer to run deblobbing and checking at each
point of a bisection, over applying the deblobbing changes as a patch,
or even starting from a Free release, rebasing a set of changes to test
onto it, and quickly building and bisecting that.  That's what I would
rather do.  But then, I probably wouldn't be using the guix build recipe
and default kernel config for the bisection, but rather a smaller config
built within the bisect tree.

> Alexandre Oliva <lxoliva@fsfla.org> wrote:

>> I'm sure that's not what you intend, but this arrangement, plus your
>> mention of hurriedly getting releases out, adds up to an incentive to
>> disable the deblobbing so as to get a faster build.

> I don't understand how you reached this conclusion.  As far as I can
> tell, changing Guix to run the deblob scripts made *no* difference to
> what someone would have to do to ask Guix to build fully-blobbed
> kernel.

One of the issues, as you'd pointed out, was time pressure to get a
build completed.  If someone is under such pressure, and knows that
deblobbing will take 30 minutes, and that verifying the deblobbed tree
will take another 30 minutes (or 24 hours, if using the wrong tool for
the job), one might disable the cleaning up rather than figuring out how
to get the recipe to use an already cleaned and verified release.

> In particular, if I can easily run an
> automated process on my own machine instead of relying on some other
> system to provide pre-generated outputs for me, then I prefer to do it
> myself.

That's at odds with the time pressure you mentioned before.

Now, let me get something straight.  You seem to have got the idea that
I oppose verification of our releases.  That's very very wrong.  I
welcome verification.  I just don't see that this belongs in the guix
build process.

I get it that guix packages several projects that need cleaning up.
IMHO, guix build recipes should NOT point at such upstream projects
along with the cleaning up recipes.  This should be part of a separate
recipe, namely, that of packaging/verifying/blessing *sources* for use
in guix.  Once the sources are packaged (in a verifiable/reproducible
way), they should be made available by the distro to users.  These are
the corresponding sources that we expect every distro to offer.

It's not just about builders getting those sources, verifying them (or
not) and making binaries out of them.  Any user ought to be entitled to
request corresponding sources to binaries provided by guix, and guix
should be able to provide them without requiring users to run
potentially complex procedures that might even end up producing
different results, depending on platform-specific bugs, versions of
tools, not to mention the various other potential sources of
non-reproducible sources and binaries.  

Even if the procedures are meant to be reproducible, you'll only know
they aren't when you manage to trace a difference in a packaged binary
back to a difference in sources, when you can no longer reproduce the
sources used before.  Archiving the sources proper, for verification and
for distribution to users as corresponding sources, would avoid
surprises of non-reproducible procedures being found out long after the
fact, just when corresponding sources are requested and can't be
provided any more.

I'm ambivalent as to whether patches that guix wishes to apply should be
applied as part of source packaging, or have the patches made available
separately.  I can see arguments both ways.  On the one hand, applying
patches, as reproducible as it normally is, might be subject to
occasional variations, especially when the line numbers or the contexts
in the patch are inexact.  On the other, these cases are extremely rare,
and being able to reuse a base tarball while trying out some patches,
without having to repackage a base tarball, and having patches
conspicuously presented to builders and users, separately from an
upstream base release, is desirable when the patches are not meant to
address freedom issues (those that do address freedom issues had better
be applied by other means during source packaging, to avoid publishing
reversible patches that could be used to reintroduce the freedom
issues).  This suggests there could be support for patches in both
source preparation recipes, and in build recipes.

For projects that need cleaning up, the source packaging recipe could
apply any needed cleaning up.  For projects like GNU Linux-libre, that
are already cleaned up, or most other packages that don't need any
cleaning up whatsoever, the source preparation recipe could be as simple
as downloading the sources, as well as any signatures thereof, checking
that they match, and recording the checksums of the sources to be used
for binary building.
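
The simple case of such a source preparation step might look roughly
like the sketch below.  It covers only the checksum part; signature
verification is out of scope here, and the function names and the
"digest  filename" record format are assumptions chosen for
illustration, not an existing Guix interface.

```python
import hashlib
import os


def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_recorded(path: str, recorded_sha256: str) -> bool:
    """Check a downloaded source file against a previously recorded digest."""
    return sha256_of(path) == recorded_sha256.lower()


def checksum_record(path: str) -> str:
    """Format a 'digest  filename' line to archive alongside the sources."""
    return "{}  {}".format(sha256_of(path), os.path.basename(path))
```

Build recipes would then consume only sources whose recorded digest
matches, without needing to know where the sources originally came from.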

Source preparation might also offer a verify mode, that would *also*
fetch the sources from a corresponding release of the project that needs
cleaning up, perform the cleaning up and compare the results, but I'd
much rather links to the corresponding projects that need cleaning up be
pushed out of FSDG-compliant distros.  Maintainers of such packages
could and probably should run such verification themselves, without
exposing every builder to the non-Free pointers and code.

I urge guix to address the problem of build recipes pointing to non-Free
packages and getting builders to download non-Free Software onto their
machines.

It would probably be wise to discuss more broadly how FSDG distros can
document and share their cleaning up procedures, so that builders and
users can double-check them if they wish to, and so that other FSDG
distros can cooperate and reuse.  Clearly we don't wish distro
maintainers to keep these private to themselves, but we surely don't
want links to sources containing unacceptable sources to be conspicuous
in the distro either, let alone being used when Free sources are or
could be readily available.

Now, I wonder...  If sources for projects other than Linux-libre and
GNUzilla need cleaning up, perhaps it would make sense for our
community, e.g. GNU, to undertake the source cleaning up, releasing
clean sources for all interested distros and users to get to.  Perhaps
we could encourage maintainers of such packages in the various Free
distros to share and divide the workload of maintaining them and the
cleaning up recipes, for everyone's benefit.  Then guix could just point
at the clean sources released by this project, instead of going through
the significant change of introducing separate 'prepare sources' recipes
to avoid pointing users at non-Free sources through build recipes.

-- 
Alexandre Oliva, happy hacker
https://FSFLA.org/blogs/lxo/
Free Software Activist
GNU Toolchain Engineer


* Re: Linux-libre 5.8 and beyond
@ 2020-08-11  4:07 Mark H Weaver
  0 siblings, 0 replies; 30+ messages in thread
From: Mark H Weaver @ 2020-08-11  4:07 UTC (permalink / raw)
  To: Bengt Richter; +Cc: guix-devel, Vagrant Cascadian, Marius Bakke

Bengt Richter <bokr@bokr.com> wrote:
> BTW, how did nix get such a weird alphabet for 0-31 ?

My guess is that the weird alphabet was chosen to avoid some of the most
common letters in English text, so that when scanning build outputs for
embedded hashes, one is less likely to mistake something else (e.g. text
or some other base32/base64 encoding) as a Nix hash.  The omitted
letters in Nix hashes are (e t o u), whereas (e t a o) are the most
common letters in English text.  I'm not sure why they chose to omit 'u'
though, given that it's quite far down the list of most common English
letters.
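
For reference, the omitted letters can be checked directly against the
alphabet.  The 32-character string below is the Nix base32 alphabet as I
understand it; treat it as an assumption and compare it with the Nix or
Guix sources before relying on it.

```python
import string

# Nix-style base32 alphabet (assumed; verify against the Nix sources).
NIX_BASE32_ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"

# Lowercase letters that never appear in a hash.
omitted = sorted(set(string.ascii_lowercase) - set(NIX_BASE32_ALPHABET))

print(len(NIX_BASE32_ALPHABET))  # 32
print(omitted)                   # ['e', 'o', 't', 'u']
```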

    Regards,
      Mark


* Linux-libre 5.8 and beyond
@ 2020-08-08 20:57 Vagrant Cascadian
  2020-08-09  0:02 ` Mark H Weaver
                   ` (2 more replies)
  0 siblings, 3 replies; 30+ messages in thread
From: Vagrant Cascadian @ 2020-08-08 20:57 UTC (permalink / raw)
  To: guix-devel; +Cc: Marius Bakke

Thanks for updating linux-libre to 5.7!

I saw the 5.8 was out, and gave a quick shot at updating it, but it hung
python indefinitely during the deblobbing process. I also tried
switching to python 3 instead of python 2, but it had the same
issue. Apparently this is a known issue:

  https://lists.gnu.org/archive/html/info-gnu/2020-08/msg00001.html

When I tried switching deblob to use gawk instead, it produced the
cleaned tarball, but resulted in syntax errors when building
linux-libre.

So I asked a bit in #linux-libre on freenode and they wondered why we
don't use the git repository instead of running the deblob scripts again
in guix.

One of the issues might be that linux-libre may occasionally remove
releases that accidentally contained non-free code breaking guix's
ability to build old versions. Not sure exactly where guix's balance
between functional package management and software freedom interplays
there.

That said, using their git repository could allow guix to take advantage
of the software heritage as a fallback; though I'm not quite sure how
well that would work with removed versions.

Downloading the git repository of a project as large as linux-libre
every time is probably somewhat expensive. Though the process of
deblobbing in guix is also quite expensive...

There's more debugging to do (and admittedly, wrapping my head around
the deblobbing code in linux.scm is a bit difficult) and the linux-libre
folks are somewhat interested to figure out what exactly is wrong with
the process building on guix.

Not sure how much time I can throw at it, but curious what others think!

live well,
  vagrant



* Re: Linux-libre 5.8 and beyond
  2020-08-08 20:57 Vagrant Cascadian
@ 2020-08-09  0:02 ` Mark H Weaver
  2020-08-09  3:30 ` Mark H Weaver
  2020-08-25 21:01 ` Leo Famulari
  2 siblings, 0 replies; 30+ messages in thread
From: Mark H Weaver @ 2020-08-09  0:02 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: guix-devel, Marius Bakke

Hi,

Vagrant Cascadian <vagrant@debian.org> wrote:
> Thanks for updating linux-libre to 5.7!

Yes, many thanks to Leo Famulari for taking care of that (large) job.

> I saw the 5.8 was out, and gave a quick shot at updating it, but it hung
> python indefinitely during the deblobbing process. I also tried
> switching to python 3 instead of python 2, but it had the same
> issue. Apparently this is a known issue:
> 
>   https://lists.gnu.org/archive/html/info-gnu/2020-08/msg00001.html

Thanks for bringing this to our attention.  Until the deblobbing issue
is resolved, in the definition of 'linux-libre-5.8-pristine-source', we
could simply replace the call to 'make-linux-libre-source' with an
ordinary 'origin' form that fetches the deblobbed source tarball from
the linux-libre project, using (linux-libre-urls linux-libre-5.8-version)
as the URI.

The bigger issue is that the default configurations will need to be
updated again before 5.7.x reaches end-of-life, which will be quite
soon.  Otherwise we'll need to revert back to 5.4.x in order to get
upstream security updates.

> So I asked a bit in #linux-libre on freenode and they wondered why we
> don't use the git repository instead of running the deblob scripts again
> in guix.
> 
> One of the issues might be that linux-libre may occasionally remove
> releases that accidentally contained non-free code breaking guix's
> ability to build old versions.

Last I checked, the linux-libre project periodically deletes most of its
older tarballs, even if there are no accidents.  This problem came to my
attention while trying to help someone determine which version of
linux-libre introduced a bug on their system.  I was about to suggest
bisecting point versions before realizing that the relevant linux-libre
tarballs had all been deleted.  Moreover, if we had succeeded in finding
the first buggy release, the next step would have been to do a 'git
bisect' to determine the precise commit that introduced the bug.

Other reasons to run the deblob scripts ourselves include:

* It may be useful for users with newer hardware devices, which are not
  yet well supported by the latest stable release, to use an arbitrary
  commit from either Linus' mainline git repository or some other
  subsystem tree.

* It allows us to update to a new point version (which usually includes
  security fixes) more quickly, before the linux-libre project reacts.

* It allows us to avoid trusting the integrity of the systems used by
  the linux-libre project to produce their deblobbed tarballs.

     Regards,
       Mark


* Re: Linux-libre 5.8 and beyond
  2020-08-08 20:57 Vagrant Cascadian
  2020-08-09  0:02 ` Mark H Weaver
@ 2020-08-09  3:30 ` Mark H Weaver
  2020-08-09  3:43   ` Vagrant Cascadian
  2020-08-09 18:49   ` Leo Famulari
  2020-08-25 21:01 ` Leo Famulari
  2 siblings, 2 replies; 30+ messages in thread
From: Mark H Weaver @ 2020-08-09  3:30 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: guix-devel, Marius Bakke

Hi Vagrant,

Vagrant Cascadian <vagrant@debian.org> wrote:
> I saw the 5.8 was out, and gave a quick shot at updating it, but it hung
> python indefinitely during the deblobbing process.

I was unable to reproduce this problem.  I simply added version 5.8 in
the usual way, without changing the deblobbing code at all, and the
deblobbing process worked correctly on my Thinkpad X200 (x86_64-linux).

I pushed commit cb97d076491495aa956dbff93679a51cc5708010 to 'master',
which adds the linux-libre@5.8 source and headers packages.  You should
be able to build the deblobbed and patched source with the following
command:

  guix build -S -e '(@ (gnu packages linux) linux-libre-5.8-source)'

Does it work for you?  If so, how does it differ from what you tried
before?

Note that the default kernel configurations for 5.8 still need to be
added.  Leo, would you like to work on that?

     Regards,
       Mark


* Re: Linux-libre 5.8 and beyond
  2020-08-09  3:30 ` Mark H Weaver
@ 2020-08-09  3:43   ` Vagrant Cascadian
  2020-08-09 18:09     ` Vagrant Cascadian
  2020-08-09 18:49   ` Leo Famulari
  1 sibling, 1 reply; 30+ messages in thread
From: Vagrant Cascadian @ 2020-08-09  3:43 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Marius Bakke

On 2020-08-08, Mark H. Weaver wrote:
> Vagrant Cascadian <vagrant@debian.org> wrote:
>> I saw the 5.8 was out, and gave a quick shot at updating it, but it hung
>> python indefinitely during the deblobbing process.
>
> I was unable to reproduce this problem.  I simply added version 5.8 in
> the usual way, without changing the deblobbing code at all, and the
> deblobbing process worked correctly on my Thinkpad X200 (x86_64-linux).

Curious. At a quick glance it looks like the same hashes for linux and
the deblob and scripts that I used. I encountered the issue both on a
pinebook pro (aarch64) and a ~6 year old i5 (x86_64) laptop, which I
figured were different enough that it was a problem in the code...


> I pushed commit cb97d076491495aa956dbff93679a51cc5708010 to 'master',
> which adds the linux-libre@5.8 source and headers packages.  You should
> be able to build the deblobbed and patched source with the following
> command:
>
>   guix build -S -e '(@ (gnu packages linux) linux-libre-5.8-source)'
>
> Does it work for you?  If so, how does it differ from what you tried
> before?

Will try to test again sometime in the coming days.


live well,
  vagrant



* Re: Linux-libre 5.8 and beyond
  2020-08-09  3:43   ` Vagrant Cascadian
@ 2020-08-09 18:09     ` Vagrant Cascadian
  2020-08-09 22:17       ` Mark H Weaver
  0 siblings, 1 reply; 30+ messages in thread
From: Vagrant Cascadian @ 2020-08-09 18:09 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: guix-devel, Marius Bakke

[-- Attachment #1: Type: text/plain, Size: 1766 bytes --]

On 2020-08-08, Vagrant Cascadian wrote:
> On 2020-08-08, Mark H. Weaver wrote:
>> Vagrant Cascadian <vagrant@debian.org> wrote:
>>> I saw the 5.8 was out, and gave a quick shot at updating it, but it hung
>>> python indefinitely during the deblobbing process.
>>
>> I was unable to reproduce this problem.  I simply added version 5.8 in
>> the usual way, without changing the deblobbing code at all, and the
>> deblobbing process worked correctly on my Thinkpad X200 (x86_64-linux).
>
> Curious. At a quick glance it looks like the same hashes for linux and
> the deblob and scripts that I used.

At a longer glance, it looks like I failed to update the hashes
correctly. The hashes for deblob-check 5.7 and deblob-check 5.8 both
began with "1n" and I must have somehow glazed over the differences and
not updated the local commit.

How guix actually managed to download deblob-check 5.8 from a different
URL and proceed to attempt to use the "old" store item without noticing
the hash was different still remains a mystery to me... I would have
expected it to error out before getting that far.


> I encountered the issue both on a
> pinebook pro (aarch64) and a ~6 year old i5 (x86_64) laptop, which I
> figured were different enough that it was a problem in the code...
>
>
>> I pushed commit cb97d076491495aa956dbff93679a51cc5708010 to 'master',
>> which adds the linux-libre@5.8 source and headers packages.  You should
>> be able to build the deblobbed and patched source with the following
>> command:
>>
>>   guix build -S -e '(@ (gnu packages linux) linux-libre-5.8-source)'
>>
>> Does it work for you?  If so, how does it differ from what you tried
>> before?
>
> Will try to test again sometime in the coming days.

in progress...


live well,
  vagrant

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Linux-libre 5.8 and beyond
  2020-08-09 18:09     ` Vagrant Cascadian
@ 2020-08-09 22:17       ` Mark H Weaver
  2020-08-10 22:39         ` Bengt Richter
  0 siblings, 1 reply; 30+ messages in thread
From: Mark H Weaver @ 2020-08-09 22:17 UTC (permalink / raw)
  To: Vagrant Cascadian; +Cc: guix-devel, Marius Bakke

Hi Vagrant,

Vagrant Cascadian <vagrant@debian.org> wrote:
> At a longer glance, it looks like I failed to update the hashes
> correctly. The hashes for deblob-check 5.7 and deblob-check 5.8 both
> began with "1n" and I must have somehow glazed over the differences and
> not updated the local commit.

Ah, okay, that makes sense.  I guess you accidentally used version 5.7
of deblob-check on the 5.8 kernel.

Note that although base32 encodes 5 bits per character, the first
character of a base32-encoded sha256 hash can only be 0 or 1, since
there's only 1 bit remaining to encode after the other 255 bits have
been encoded in the last 51 characters.
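
The arithmetic above can be checked with a small sketch.  It assumes the
nix-style base32 that guix prints by default: the digest is read as a
little-endian 256-bit integer and emitted 5 bits at a time, most
significant digit first, using the alphabet shown below.  The function
name nix_base32 is made up for this illustration and is not part of Guix.

```python
import hashlib

# Nix-style base32 alphabet: digits plus lowercase letters without e, o, t, u.
ALPHABET = "0123456789abcdfghijklmnpqrsvwxyz"


def nix_base32(digest: bytes) -> str:
    """Encode a digest the way nix-style base32 is assumed to work here."""
    # The digest is treated as a little-endian integer.
    value = int.from_bytes(digest, "little")
    # Number of 5-bit digits needed: 52 for a 256-bit digest.
    ndigits = (len(digest) * 8 + 4) // 5
    # Emit digits from the most significant end; for sha256 the top
    # digit only covers bit 255, so it can only be 0 or 1.
    return "".join(
        ALPHABET[(value >> (5 * n)) & 0x1F] for n in reversed(range(ndigits))
    )


for data in (b"", b"deblob-check 5.7", b"deblob-check 5.8"):
    encoded = nix_base32(hashlib.sha256(data).digest())
    # The first character is always "0" or "1"; the length is always 52.
    print(encoded[0], len(encoded))
```

Because the leading digit carries a single bit, two unrelated hashes
share their first character about half the time, so a matching prefix
says very little about whether a hash was updated.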

> How guix actually managed to download deblob-check 5.8 from a different
> URL and proceed to attempt to use the "old" store item without noticing
> the hash was different still remains a mystery to me... I would have
> expected it to error out before getting that far.

If the file name and hash match a previously downloaded file in your
store, the guix daemon uses that one and skips the download, regardless
of the URL.  That's why no error was reported.  There's no version
number in the file name of the 'deblob-check' file.
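
The behaviour described above can be modelled with a toy cache.  This is
only a sketch of the idea, not the guix daemon's actual code: the store
is a plain dictionary keyed on (hash, file name), and the fetch function,
the example.invalid URLs and the script contents are all hypothetical.

```python
import hashlib


def fetch(store, name, expected_hash, url, download):
    """Return the file for (expected_hash, name), downloading only on a miss."""
    key = (expected_hash, name)
    if key in store:
        # Cache hit: the URL is never consulted, so a wrong URL goes unnoticed.
        return store[key]
    data = download(url)
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_hash:
        raise ValueError(f"hash mismatch for {name}: got {actual}")
    store[key] = data
    return data


# Hypothetical upstream files; the file name carries no version number.
URL_57 = "https://example.invalid/5.7/deblob-check"
URL_58 = "https://example.invalid/5.8/deblob-check"
CONTENTS = {URL_57: b"script for 5.7", URL_58: b"script for 5.8"}

requested = []


def download(url):
    requested.append(url)
    return CONTENTS[url]


store = {}
hash_57 = hashlib.sha256(CONTENTS[URL_57]).hexdigest()

# First fetch populates the store with the 5.7 script.
first = fetch(store, "deblob-check", hash_57, URL_57, download)
# The URL is bumped to 5.8 but the hash is left stale: the old item is reused.
second = fetch(store, "deblob-check", hash_57, URL_58, download)
print(second, requested)
```

With an empty store the same stale hash would have triggered a download
of the 5.8 file and then a hash mismatch error, which is the failure one
would normally expect to see.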

    Thanks!
      Mark

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Linux-libre 5.8 and beyond
  2020-08-09 22:17       ` Mark H Weaver
@ 2020-08-10 22:39         ` Bengt Richter
  2020-08-11  2:37           ` Tobias Geerinckx-Rice
  0 siblings, 1 reply; 30+ messages in thread
From: Bengt Richter @ 2020-08-10 22:39 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Vagrant Cascadian, guix-devel, Marius Bakke


On +2020-08-09 18:17:48 -0400, Mark H Weaver wrote:
> 
> Note that although base32 encodes 5 bits per character, the first
> character of a base32-encoded sha256 hash can only be 0 or 1, since
> there's only 1 bit remaining to encode after the other 255 bits have
> been encoded in the last 51 characters.
> 
Unless I am mistaken, that's only true for the nix flavor of base32 (which is
the default for guix hash, I think). Again, unless I am mistaken, the nix view
of a 256-bit sha256sum hash is little-endian, and shifts 5 bits out the bottom,
as if with euclidean/ 32, and so winds up with the 1 or 0 last, at the top.

I think all the other base32 variants shift 5 bits at a time from the big end, and
could have the full range 0-31 for the top digit, however translated to glyphs.
That also means the last digit on the right holds a 1 or 0 in its top bit, valued 16 or 0.

Of course, different length digests may produce other remainder end values.
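
A quick way to check the big-endian case is Python's standard base64
module, which implements the RFC 4648 base32 alphabet.  This is a sketch
for illustration only; the helper name rfc4648_base32 is made up here.

```python
import base64
import hashlib


def rfc4648_base32(digest: bytes) -> str:
    """Encode with standard (RFC 4648) base32 and drop the '=' padding."""
    return base64.b32encode(digest).decode("ascii").rstrip("=")


# An all-ones digest shows the extremes: the first digit uses the full
# 5-bit range, while the last digit holds one data bit plus four zero bits.
all_ones = rfc4648_base32(b"\xff" * 32)
print(all_ones[0], all_ones[-1], len(all_ones))

# For real sha256 digests the last character is always "A" (value 0)
# or "Q" (value 16), whatever the input.
for data in (b"", b"deblob-check 5.7", b"deblob-check 5.8"):
    print(rfc4648_base32(hashlib.sha256(data).digest())[-1])
```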

BTW, how did nix get such a weird alphabet for 0-31 ? Watermarking themselves? :)

-- 
Regards,
Bengt Richter


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Linux-libre 5.8 and beyond
  2020-08-10 22:39         ` Bengt Richter
@ 2020-08-11  2:37           ` Tobias Geerinckx-Rice
  0 siblings, 0 replies; 30+ messages in thread
From: Tobias Geerinckx-Rice @ 2020-08-11  2:37 UTC (permalink / raw)
  To: Bengt Richter; +Cc: guix-devel

[-- Attachment #1: Type: text/plain, Size: 570 bytes --]

Bengt,

Bengt Richter 写道：
> BTW, how did nix get such a weird alphabet for 0-31 ? 
> Watermarking themselves? :)

This question probably deserves a Nix FAQ entry by now, if there 
isn't one already :-)

  “This is to reduce the possibility that hash representations 
  contain character sequences that are potentially offensive to 
  some users (a known possibility with alphanumeric representations 
  of numbers).”
    -- https://edolstra.github.io/pubs/phd-thesis.pdf

Exercises for the puerile reader are obvious.

Kind regards,

T G-R

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 227 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Linux-libre 5.8 and beyond
  2020-08-09  3:30 ` Mark H Weaver
  2020-08-09  3:43   ` Vagrant Cascadian
@ 2020-08-09 18:49   ` Leo Famulari
  1 sibling, 0 replies; 30+ messages in thread
From: Leo Famulari @ 2020-08-09 18:49 UTC (permalink / raw)
  To: Mark H Weaver; +Cc: Vagrant Cascadian, guix-devel, Marius Bakke

On Sat, Aug 08, 2020 at 11:30:40PM -0400, Mark H Weaver wrote:
> Note that the default kernel configurations for 5.8 still need to be
> added.  Leo, would you like to work on that?

I'm planning to do that when the 5.8 kernel becomes the "stable" kernel.
I assume this will happen soon.


^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Linux-libre 5.8 and beyond
  2020-08-08 20:57 Vagrant Cascadian
  2020-08-09  0:02 ` Mark H Weaver
  2020-08-09  3:30 ` Mark H Weaver
@ 2020-08-25 21:01 ` Leo Famulari
  2020-08-26  3:17   ` Leo Famulari
  2 siblings, 1 reply; 30+ messages in thread
From: Leo Famulari @ 2020-08-25 21:01 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 609 bytes --]

Hi,

I have started handling major updates of linux-libre for Guix, starting
with version 5.7 (collaborators are invited!).

I didn't read this discussion because it's quite long and I don't
perceive that anything needs to change with how we package linux-libre.
It has worked well for several years.

If there are concrete problems to report or changes to request, please
let us know by opening a bug ticket at <bug-guix@gnu.org>, or by sending
a patch to <guix-patches@gnu.org>.

Otherwise, I don't think this conversation is very productive and I
request that it either stops or moves somewhere else.

Leo

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Linux-libre 5.8 and beyond
  2020-08-25 21:01 ` Leo Famulari
@ 2020-08-26  3:17   ` Leo Famulari
  2020-08-26 15:41     ` Katherine Cox-Buday
  0 siblings, 1 reply; 30+ messages in thread
From: Leo Famulari @ 2020-08-26  3:17 UTC (permalink / raw)
  To: guix-devel

[-- Attachment #1: Type: text/plain, Size: 1189 bytes --]

On Tue, Aug 25, 2020 at 05:01:07PM -0400, Leo Famulari wrote:
> If there are concrete problems to report or changes to request, please
> let us know by opening a bug ticket at <bug-guix@gnu.org>, or by sending
> a patch to <guix-patches@gnu.org>.

I'd like to explain more clearly what I meant by my last message.

First, it's important to remember that we share a common goal: the
creation of freely licensed computing systems. The people involved in
the Guix and linux-libre projects have worked hard for years towards
this goal — most of us as volunteers.

For many of us, it's natural to identify with our work, and when people
suggest changes or offer criticism that we think is incorrect or misses
the mark somehow, it's also natural to have negative feelings in
response.

But, we must remember that the other party may not understand the
context of their suggestion deeply enough to know why it should not be
implemented. There are technical *and* social contexts at play, and they
are rarely understood until one has been working on the project for
quite a while.

We should be generous and charitable when interpreting each other's
messages and goals.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: Linux-libre 5.8 and beyond
  2020-08-26  3:17   ` Leo Famulari
@ 2020-08-26 15:41     ` Katherine Cox-Buday
  0 siblings, 0 replies; 30+ messages in thread
From: Katherine Cox-Buday @ 2020-08-26 15:41 UTC (permalink / raw)
  To: Leo Famulari; +Cc: guix-devel

Leo Famulari <leo@famulari.name> writes:

> On Tue, Aug 25, 2020 at 05:01:07PM -0400, Leo Famulari wrote:
> But, we must remember that the other party may not understand the
> context of their suggestion deeply enough to know why it should not be
> implemented. There are technical *and* social contexts at play, and they
> are rarely understood until one has been working on the project for
> quite a while.
>
> We should be generous and charitable when interpreting each other's
> messages and goals.

What a great outlook from a maintainer of an open source project. I have
read nothing else in this thread, but this is wonderful.

-- 
Katherine


^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2020-08-26 15:42 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2020-08-09 20:15 Linux-libre 5.8 and beyond Jason Self
2020-08-13  0:39 ` Mark H Weaver
2020-08-13 16:47   ` Linux-libre git repository Vagrant Cascadian
2020-08-14  0:03     ` Jason Self
2020-08-14 14:03     ` Danny Milosavljevic
2020-08-14 13:47   ` Linux-libre 5.8 and beyond Alexandre Oliva
2020-08-15  6:03     ` Mark H Weaver
2020-08-16  1:24       ` Mark H Weaver
2020-08-16 12:43         ` Jason Self
2020-08-16 10:54       ` Jason Self
2020-08-24  3:45       ` Alexandre Oliva
2020-08-25  4:14         ` Mark H Weaver
2020-08-25 11:12           ` Alexandre Oliva
2020-08-24  3:58       ` Alexandre Oliva
2020-08-24  4:12       ` Alexandre Oliva
2020-08-24  4:34       ` Alexandre Oliva
2020-08-24  4:42       ` Alexandre Oliva
  -- strict thread matches above, loose matches on Subject: below --
2020-08-11  4:07 Mark H Weaver
2020-08-08 20:57 Vagrant Cascadian
2020-08-09  0:02 ` Mark H Weaver
2020-08-09  3:30 ` Mark H Weaver
2020-08-09  3:43   ` Vagrant Cascadian
2020-08-09 18:09     ` Vagrant Cascadian
2020-08-09 22:17       ` Mark H Weaver
2020-08-10 22:39         ` Bengt Richter
2020-08-11  2:37           ` Tobias Geerinckx-Rice
2020-08-09 18:49   ` Leo Famulari
2020-08-25 21:01 ` Leo Famulari
2020-08-26  3:17   ` Leo Famulari
2020-08-26 15:41     ` Katherine Cox-Buday

Code repositories for project(s) associated with this external index

	https://git.savannah.gnu.org/cgit/guix.git
