* ZFS on Guix, again
@ 2021-02-20 8:48 raid5atemyhomework
2021-02-20 11:44 ` 宋文武
2021-02-22 8:57 ` Ludovic Courtès
0 siblings, 2 replies; 6+ messages in thread
From: raid5atemyhomework @ 2021-02-20 8:48 UTC (permalink / raw)
To: guix-devel@gnu.org, Ludovic Courtès
Hi guix-devel,
I had some questions on the big ZFS patches in the Guix bug tracker a week ago, and did not find any response, so I am back here pestering everyone.
Anyway:
* I am wary of calling the service type that accepts kernel modules `linux-loadable-module-service-type`:
* The equivalent existing `operating-system` field is `kernel-loadable-modules`. Because `operating-system` is user-facing, we cannot rename it to `linux-loadable-modules`, thus leading to a naming inconsistency (the `operating-system` field is `kernel-loadable-modules`, the service type that adds items to that field is `linux-loadable-module-service-type`).
* Just because Guix only supports two kernels *now*, and only one of them (`linux-libre`) supports loadable modules, does not mean Guix will never support *other* kernels with a concept of a loadable module (e.g. the FreeBSD kernel). OpenZFS itself can be compiled as an out-of-tree kernel module for FreeBSD, and nothing really prevents Guix from supporting FreeBSD in the future, so naming it `linux` loadable modules seems premature when kernel-loadable modules clearly exist on other kernels too.
* There is already an existing `kernel-module-loader-service-type`. This is used to explicitly load kernel modules, which either have to be in the `kernel-loadable-modules` field of `operating-system`, or provided by extending with the new, inconsistently named `linux-loadable-module-service-type`.
Adopting the name `linux-loadable-module-service-type` would then mean that, for consistency:
* We should deprecate `kernel-module-loader-service-type` and replace it with an equivalent `linux-module-loader-service-type`.
* We should deprecate the `operating-system` `kernel-loadable-modules` field and replace it with an equivalent `linux-loadable-modules` field.
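For context, here is how the existing pieces fit together today, modeled on the example in the Guix manual (`kernel-loadable-modules`, `kernel-module-loader-service-type`, and `ddcci-driver-linux` are all real Guix APIs; the rest of the `operating-system` is elided):

    (use-modules (gnu)
                 (gnu packages linux)
                 (gnu services linux))

    (operating-system
      ;; ... host-name, bootloader, file-systems, etc. omitted ...

      ;; Make the out-of-tree module available in the kernel profile.
      (kernel-loadable-modules (list ddcci-driver-linux))

      ;; Ask the existing loader service to modprobe it at boot.
      (services
       (cons (simple-service 'ddcci-module-loading
                             kernel-module-loader-service-type
                             '("ddcci" "ddcci_backlight"))
             %base-services)))

The new service type would sit on the first half of this: it lets *services*, rather than the user, add packages to the kernel profile.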
In any case, I have some sketches below.
I want to create two new service types:
* `linux-profile-builder-service-type` which has configuration `linux-profile-builder-configuration`.
* `linux-profile-builder-configuration` has fields:
* `linux-libre` which is the `kernel` field of the `operating-system`.
* `loadable-modules` which is the `kernel-loadable-modules` field of the `operating-system`.
* This type is extensible. `compose` is `identity`, `extend` is `(lambda (config extensions) ((apply compose identity extensions) config))`
* In short, extensions of this service type are procedures that take the `linux-profile-builder-configuration` and return a modified version of it.
* This extends the root `system-service-type`, creating the `kernel` entry of the system directory.
* `linux-loadable-module-service-type`, which takes as its configuration a list of kernel modules (by default empty).
* This type is extensible. `compose` is `concatenate`, `extend` is `append`.
* This has a single service extension:
* Extends `linux-profile-builder-service-type`: if the configuration is a non-empty list, it contributes a procedure that appends that list of kernel-loadable modules to the profile builder's configuration.
The above gives a separation of concepts:
* The `linux-profile-builder` builds the kernel profile for Linux-libre systems.
* The `linux-loadable-module-service-type` ensures that the kernel profile contains particular loadable kernel modules.
In the future there may be additional non-module things we can add to the Linux profile, so I think this separation is useful.
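To make this concrete, here is a rough sketch of the two service types in Guix's service DSL. Everything here is part of this proposal, not existing Guix code; in particular `linux-profile-builder->system-entry` is a hypothetical helper, shown as a stub:

    (use-modules (gnu services)
                 (guix records)
                 (srfi srfi-1))

    (define-record-type* <linux-profile-builder-configuration>
      linux-profile-builder-configuration
      make-linux-profile-builder-configuration
      linux-profile-builder-configuration?
      (linux-libre      linux-profile-builder-configuration-linux-libre)
      (loadable-modules linux-profile-builder-configuration-loadable-modules
                        (default '())))

    (define (linux-profile-builder->system-entry config)
      ;; Hypothetical helper: contribute the "kernel" entry of the system
      ;; directory.  A real implementation would build a kernel profile
      ;; that includes the loadable modules alongside the kernel.
      `(("kernel" ,(linux-profile-builder-configuration-linux-libre config))))

    (define linux-profile-builder-service-type
      (service-type
       (name 'linux-profile-builder)
       (extensions
        (list (service-extension system-service-type
                                 linux-profile-builder->system-entry)))
       ;; Each extension is a procedure that rewrites the configuration.
       (compose identity)
       (extend (lambda (config extensions)
                 ((apply compose identity extensions) config)))
       (description "Build the kernel profile of Linux-libre systems.")))

    (define linux-loadable-module-service-type
      (service-type
       (name 'linux-loadable-module)
       (extensions
        (list (service-extension
               linux-profile-builder-service-type
               (lambda (modules)
                 (lambda (config)
                   ;; Append the collected modules to the profile builder.
                   (linux-profile-builder-configuration
                    (inherit config)
                    (loadable-modules
                     (append (linux-profile-builder-configuration-loadable-modules
                              config)
                             modules))))))))
       (compose concatenate)
       (extend append)
       (default-value '())
       (description "Add loadable modules to the kernel profile.")))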
--
Another point I want to bring up is the use of `file-system-service-type`.
If we use `file-system-service-type` to extend the `file-systems` Shepherd service, then we need to add some kind of field to exempt the ZFS file systems from being added to `/etc/fstab`.
Note that ZFS expects there to be dozens of file systems, and that creating and destroying file systems is just a "simple" `zfs create pool/file/system` command. Each application may need specific tuning, so each application may very well get its own file system, with ZFS parameters tuned specifically for that application.
This is not a good fit with the `operating-system` mechanism in Guix, where you have to reconfigure the entire system just to add or remove file systems.
Nevertheless, ZFS still supports manual file system management: if you create a dataset with `zfs create -o mountpoint=legacy pool/file/system`, it is still possible to use `operating-system` and its `file-systems` field to manage its mount. Guix still needs some modifications, though, since the `device` would have to be `"pool/file/system"` and some parts of Guix attempt to search for a block device.
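For illustration, a legacy-mounted dataset might then be declared roughly like this (the mount point is made up, and `"zfs"` as a `type` assumes the mount helpers know about ZFS):

    (file-system
      (device "pool/file/system")    ;a dataset name, not a block device
      (mount-point "/srv/data")
      (type "zfs")
      (check? #f)                    ;there is no fsck for ZFS
      (create-mount-point? #t))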
However, for the case where the user expects the "typical" ZFS style of managing file systems, we need to mount all the ZFS file systems and ensure that they are all already mounted by the time the `file-systems` Shepherd service is started. This means we need to be able to extend the `requirement` of the `file-systems` Shepherd service. And we need to do that without adding any extra `/etc/fstab` entries, since in the "typical" ZFS style of managing file systems they are required to ***not*** be put in `/etc/fstab`.
* We can just create a separate `file-systems-target-service-type` that accepts lists of symbols to add to the `requirement` of the `file-systems` Shepherd service. Then `file-system-service-type` can just extend that service type. This is what I originally did.
* 宋文武 proposed to instead make `file-system-service-type` accept a heterogeneous list of either symbols or `<file-system>` records.
* Ludo' ***agreed*** with this... but then said that mixing symbols and `<file-system>` records in the same list is bad design. So this is confusing.
There are two alternatives:
* Go with what I already proposed, which I think is more general-purpose and cleaner (there is a separate service type that accepts symbols, and a separate service type that accepts `<file-system>` records, and the latter just extends the former).
* Don't make a separate service type; instead add some kind of `fstab?` field to `file-system`, so that the ZFS file systems mounted by the ZFS Shepherd service are not included in `/etc/fstab`.
I think overall that having lots of tiny service types that are then combined together fits the functional design of Guix better. So I would strongly propose my original design rather than hacks on top of `file-system-service-type`.
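For concreteness, here is a sketch of the separate target service type under my proposed naming (none of this is existing Guix code):

    (use-modules (gnu services)
                 (gnu services shepherd)
                 (guix gexp)
                 (srfi srfi-1))

    (define file-systems-target-service-type
      (service-type
       (name 'file-systems-target)
       (extensions
        (list (service-extension
               shepherd-root-service-type
               (lambda (requirements)
                 ;; A pure "target": it does nothing by itself, it only
                 ;; waits for the Shepherd services named by the symbols
                 ;; that extensions contributed.
                 (list (shepherd-service
                        (provision '(file-systems))
                        (requirement requirements)
                        (documentation "Target for all mounted file systems.")
                        (start #~(const #t))
                        (stop #~(const #f))
                        (respawn? #f)))))))
       (compose concatenate)
       (extend append)
       (default-value '())
       (description "The Shepherd target reached once all file systems are
mounted.")))

`file-system-service-type` would then extend this with one symbol per `<file-system>` entry, and the ZFS Shepherd service would extend it with the symbol of its own mount service.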
Thanks
raid5atemyhomework
* Re: ZFS on Guix, again
2021-02-20 8:48 ZFS on Guix, again raid5atemyhomework
@ 2021-02-20 11:44 ` 宋文武
2021-02-22 8:57 ` Ludovic Courtès
1 sibling, 0 replies; 6+ messages in thread
From: 宋文武 @ 2021-02-20 11:44 UTC (permalink / raw)
To: raid5atemyhomework; +Cc: guix-devel@gnu.org, 45692
raid5atemyhomework <raid5atemyhomework@protonmail.com> writes:
> Hi guix-devel,
>
> I had some questions on the big ZFS guix bugpatches a week ago, and
> did not find any response, so I am back here pestering everyone.
Hello, thank you for working on ZFS for guix!
>
> [...]
> There are two alternatives:
>
> * Go with what I already proposed, which I think is more general-purpose and cleaner (there is a separate service type that accepts symbols, and a separate service type that accepts `<file-system>` records, and the latter just extends the former).
> * Don't make a separate service type; instead add some kind of `fstab?` field to `file-system`, so that the ZFS file systems mounted by the ZFS Shepherd service are not included in `/etc/fstab`.
>
> I think overall that having lots of tiny service types that are then
> combined together fits the functional design of Guix better. So I
> would strongly propose my original design rather than hacks on top of
> `file-system-service-type`.
Well, I think the 'file-system-service-type' should handle all file
systems related configuration, but my opinion is not strong. Waiting
for Ludo to decide...
* Re: ZFS on Guix, again
2021-02-20 8:48 ZFS on Guix, again raid5atemyhomework
2021-02-20 11:44 ` 宋文武
@ 2021-02-22 8:57 ` Ludovic Courtès
2021-02-23 1:11 ` raid5atemyhomework
1 sibling, 1 reply; 6+ messages in thread
From: Ludovic Courtès @ 2021-02-22 8:57 UTC (permalink / raw)
To: raid5atemyhomework; +Cc: guix-devel@gnu.org
Hi,
Sorry for the delay; this isn’t as simple as it looks!
I agree with 宋文武 regarding ‘file-system-service-type’.
raid5atemyhomework <raid5atemyhomework@protonmail.com> skribis:
> However, for the case where the user expects the "typical" ZFS style of managing file systems, we need to mount all the ZFS file systems and ensure that they are all already mounted by the time the `file-systems` Shepherd service is started. This means we need to be able to extend the `requirement` of the `file-systems` Shepherd service. And we need to do that without adding any extra `/etc/fstab` entries, since in the "typical" ZFS style of managing file systems they are required to ***not*** be put in `/etc/fstab`.
Looks like this fstab issue is the main reason why you felt the need to
define an extra service type. Why is it important that ZFS not be
listed in /etc/fstab?
Thanks,
Ludo’.
* Re: ZFS on Guix, again
2021-02-22 8:57 ` Ludovic Courtès
@ 2021-02-23 1:11 ` raid5atemyhomework
2021-02-25 5:08 ` raid5atemyhomework
0 siblings, 1 reply; 6+ messages in thread
From: raid5atemyhomework @ 2021-02-23 1:11 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel@gnu.org
Hi Ludo'
> Hi,
>
> Sorry for the delay; this isn’t as simple as it looks!
>
> I agree with 宋文武 regarding ‘file-system-service-type’.
>
> raid5atemyhomework <raid5atemyhomework@protonmail.com> skribis:
>
> > However, for the case where the user expects the "typical" ZFS style of managing file systems, we need to mount all the ZFS file systems and ensure that they are all already mounted by the time the `file-systems` Shepherd service is started. This means we need to be able to extend the `requirement` of the `file-systems` Shepherd service. And we need to do that without adding any extra `/etc/fstab` entries, since in the "typical" ZFS style of managing file systems they are required to not be put in `/etc/fstab`.
>
> Looks like this fstab issue is the main reason why you felt the need to
> define an extra service type. Why is it important that ZFS not be
> listed in /etc/fstab?
Because on non-Guix operating systems, ZFS file systems aren't listed in `/etc/fstab`:
* https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gaztn/index.html
What ZFS users expect is that you just do something as simple as this:
# zpool create mypool raidz2 /dev/disk/by-id/ata-Generic_M0D3L_53R14LN0 /dev/disk/by-id/ata-Generic_M0D3L_53R14LN1 /dev/disk/by-id/ata-Generic_M0D3L_53R14LN2 /dev/disk/by-id/ata-Generic_M0D3L_53R14LN3 log mirror /dev/disk/by-id/ata-Generic_55DM0D3L_53R14LN0 /dev/disk/by-id/ata-Generic_55DM0D3L_53R14LN1 /dev/disk/by-id/ata-Generic_55DM0D3L_53R14LN2
And what happens is:
* The pool `mypool` is created containing a RAIDZ-2 of the 4 HDDs listed, with a separate log device consisting of a mirror of 3 SSDs.
* A filesystem `mypool` is created on the pool `mypool`.
* The `mypool` filesystem is mounted on `/mypool`.
* On all subsequent bootups, the `mypool` filesystem is mounted on `/mypool`.
In ZFS you are expected to have dozens of filesystems. If you have a new application, the general expectation is that you create a new filesystem for it. In general you might have one pool, or maybe two or three, but you host most of your data in multiple filesystems on that same pool.
So for example you might want to create a filesystem for videos, which are sequentially accessed and tend to be fairly large; setting `recordsize=1M` makes sense there (good for sequential access to very large files measured in dozens of megabytes, not so good for random access).
# zfs create -o recordsize=1M -o mountpoint=/home/raid5atemyhomework/Videos mypool/videos
The above command does the following:
* The filesystem `videos` is created on the pool `mypool`.
* The `mypool/videos` filesystem is mounted on `/home/raid5atemyhomework/Videos`.
* On all subsequent bootups, the filesystem is mounted on `/home/raid5atemyhomework/Videos`.
Now I might also want to run say a PostgreSQL service.
* PostgreSQL allocates in page sizes of 8k, so `recordsize=8k` is best.
* PostgreSQL uses a journal, which has a different access pattern from the rest of the data. Journals are written sequentially and read sequentially, while the database itself is accessed randomly.
* The data should have `logbias=throughput` to reduce use of the ZIL SLOG and avoid "log on a log" slowdown effects.
* The journal itself should continue to use the default `logbias=latency`.
So I would do:
# zfs create -o recordsize=8k -o logbias=throughput -o mountpoint=/postgresql mypool/postgresql
# zfs create -o logbias=latency -o mountpoint=/postgresql/pg_wal mypool/postgresql/pg_wal
That means creating two filesystems for a single application, one for the PostgreSQL data, the other for the PostgreSQL journal.
What the above examples show is:
* The habit for a ZFS user is to create many filesystems. On my own homelab I have two filesystems (one for documents and code, one for videos and pictures) for data I manage myself, and I have two other filesystems for two different applications I am running as well.
* Each filesystem has different tuning properties.
On a server you might have a dozen or so ZFS filesystems for various applications you need to run. There are also many other tuning parameters to tweak. If done by `/etc/fstab` it would lead to a fairly large file.
The base logic here is that `/etc/fstab` has to be stored on disk anyway, and ZFS can just store the same information on the disks it is managing directly. Then ZFS supports nice tabulated output of properties via `zfs list`:
# zfs list -o name,recordsize,logbias,atime,relatime
NAME               RECSIZE  LOGBIAS  ATIME  RELATIME
hddpool               128K  latency  off    on
hddpool/bitcoin       128K  latency  off    on
hddpool/common        128K  latency  off    on
hddpool/lightning      64K  latency  off    on
hddpool/media           1M  latency  off    on
And you can change parameters easily with `zfs set`. There are many dozens of possible properties as well.
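For example, to change the record size of an existing dataset (using the `hddpool/lightning` dataset from the listing above; note this only affects newly written records, existing data keeps the record size it was written with):

# zfs set recordsize=64K hddpool/lightning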
Thus, the general expectation among ZFS users is ***not*** to use any kind of `/etc/fstab` at all: with a dozen filesystems and several properties each, such an `/etc/fstab` would be ludicrously large. And the declarative `file-system` Guix syntax is really just `/etc/fstab` in another format. So the expectation of a ZFS user is to keep using the classic `zpool` and `zfs` commands to manage the filesystems and their parameters.
The main purpose of the `operating-system` declaration is to allow the system to be brought back again, but the configuration file has to exist on *some* permanent storage anyway, so the information might as well be managed by ZFS directly on the disks it is managing.
ZFS also allows snapshotting the configuration of the pool, so keeping a copy of that configuration in the `operating-system` offers no real advantage: you can roll back changes to the pool just as well as you can roll back the `operating-system`.
Since this is the expected behavior of ZFS, we should support it as much as possible.
If the user really wants to manage ZFS via `file-system` declarations, they can set `mountpoint=legacy` and put the datasets in `file-system` declarations that then become `/etc/fstab` entries. But if the user doesn't want to manage them via `file-system` declarations, we should support that use-case too, because that is how ZFS is meant to be used on other operating systems. So we need the `file-systems` Shepherd service to also wait for non-`/etc/fstab` filesystems like ZFS, not just those listed in `/etc/fstab`.
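Concretely, with the `file-systems-target-service-type` I sketched earlier, a hypothetical ZFS service would only need something like this to make `file-systems` wait for its mount service (`zfs-mount` is an assumed Shepherd service name):

    (simple-service 'zfs-file-systems
                    file-systems-target-service-type
                    '(zfs-mount))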
Thanks
raid5atemyhomework
* Re: ZFS on Guix, again
2021-02-23 1:11 ` raid5atemyhomework
@ 2021-02-25 5:08 ` raid5atemyhomework
2021-03-09 2:34 ` raid5atemyhomework
0 siblings, 1 reply; 6+ messages in thread
From: raid5atemyhomework @ 2021-02-25 5:08 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel@gnu.org
Hi Ludo,
> > I agree with 宋文武 regarding ‘file-system-service-type’.
> > raid5atemyhomework <raid5atemyhomework@protonmail.com> skribis:
> >
> > > However, for the case where the user expects the "typical" ZFS style of managing file systems, we need to mount all the ZFS file systems and ensure that they are all already mounted by the time the `file-systems` Shepherd service is started. This means we need to be able to extend the `requirement` of the `file-systems` Shepherd service. And we need to do that without adding any extra `/etc/fstab` entries, since in the "typical" ZFS style of managing file systems they are required to not be put in `/etc/fstab`.
> >
> > Looks like this fstab issue is the main reason why you felt the need to
> > define an extra service type. Why is it important that ZFS not be
> > listed in /etc/fstab?
>
> Because on non-Guix operating systems, ZFS file systems aren't listed in `/etc/fstab`:
>
> - https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gaztn/index.html
So what do we do here?
* Force all ZFS filesystems to be declared `mountpoint=legacy` and be written as `file-system` declarations in the `operating-system` (which will eventually reach `/etc/fstab`).
* This is undesirable since ZFS users expect that setting up mount points for zpools and ZFS datasets is handled by the same commands that create the zpool and ZFS dataset. This is in contrast with other file systems, where creating the filesystem is a separate step from adding its mount point.
* If a ZFS filesystem is created or destroyed (for example I might want to create a temporary filesystem to `zfs send` to in order to implement defragmentation, or to recompress data if I forgot to set `compression=on`) then the user has to edit the configuration file and then `guix system reconfigure` in order to make the changes stick. Most ZFS users just create and destroy ZFS datasets as part of maintenance.
* If Guix goes this way, most ZFS users (including me) will not consider ZFS support on Guix to be anywhere near "serviceable".
* Hack a `fstab?` field in `file-system` forms.
* Arguably bad design.
* Just split the Shepherd service into a `file-systems-target-service-type` and have `file-system-service-type` extend it, like I already proposed before.
---
Also, how about `linux-loadable-module-service-type`? Is the proposed design okay? Do we really want to name it `linux-loadable-module-service-type`, in contrast to the current `operating-system` field `kernel-loadable-modules`?
Thanks
raid5atemyhomework
* Re: ZFS on Guix, again
2021-02-25 5:08 ` raid5atemyhomework
@ 2021-03-09 2:34 ` raid5atemyhomework
0 siblings, 0 replies; 6+ messages in thread
From: raid5atemyhomework @ 2021-03-09 2:34 UTC (permalink / raw)
To: Ludovic Courtès; +Cc: guix-devel@gnu.org
BUMP
> Hi Ludo,
>
> > > I agree with 宋文武 regarding ‘file-system-service-type’.
> > > raid5atemyhomework <raid5atemyhomework@protonmail.com> skribis:
> > >
> > > > However, for the case where the user expects the "typical" ZFS style of managing file systems, we need to mount all the ZFS file systems and ensure that they are all already mounted by the time the `file-systems` Shepherd service is started. This means we need to be able to extend the `requirement` of the `file-systems` Shepherd service. And we need to do that without adding any extra `/etc/fstab` entries, since in the "typical" ZFS style of managing file systems they are required to not be put in `/etc/fstab`.
> > >
> > > Looks like this fstab issue is the main reason why you felt the need to
> > > define an extra service type. Why is it important that ZFS not be
> > > listed in /etc/fstab?
> >
> > Because on non-Guix operating systems, ZFS file systems aren't listed in `/etc/fstab`:
> >
> > - https://docs.oracle.com/cd/E19120-01/open.solaris/817-2271/gaztn/index.html
>
> So what do we do here?
>
> - Force all ZFS filesystems to be declared `mountpoint=legacy` and be written as `file-system` declarations in the `operating-system` (which will eventually reach `/etc/fstab`).
> - This is undesirable since ZFS users expect that setting up mount points for zpools and ZFS datasets is handled by the same commands that create the zpool and ZFS dataset. This is in contrast with other file systems, where creating the filesystem is a separate step from adding its mount point.
> - If a ZFS filesystem is created or destroyed (for example I might want to create a temporary filesystem to `zfs send` to in order to implement defragmentation, or to recompress data if I forgot to set `compression=on`) then the user has to edit the configuration file and then `guix system reconfigure` in order to make the changes stick. Most ZFS users just create and destroy ZFS datasets as part of maintenance.
> - If Guix goes this way, most ZFS users (including me) will not consider ZFS support on Guix to be anywhere near "serviceable".
> - Hack a `fstab?` field in `file-system` forms.
> - Arguably bad design.
> - Just split the Shepherd service into a `file-systems-target-service-type` and have `file-system-service-type` extend it, like I already proposed before.
>
> Also, how about `linux-loadable-module-service-type`? Is the proposed design okay? Do we really want to name it `linux-loadable-module-service-type`, in contrast to the current `operating-system` field `kernel-loadable-modules`?
>
> Thanks
> raid5atemyhomework