From: Bengt Richter <bokr@bokr.com>
To: Ricardo Wurmus <rekado@elephly.net>
Cc: Mathieu Othacehe <othacehe@gnu.org>, 51787@debbugs.gnu.org
Subject: bug#51787: Disk performance on ci.guix.gnu.org
Date: Wed, 22 Dec 2021 00:20:24 +0100
Message-ID: <20211221232024.GA41746@LionPure>
In-Reply-To: <87v8zhn9m1.fsf@elephly.net>
Hi Ricardo,
TL;DR: re: "Any ideas?" :)

Read this [0], and consider how file systems may be
interacting with SSD wear-leveling algorithms.

Are some file systems dependent on successful speculative
transaction continuations, while others might slow down
waiting for signs that the SSD controller has committed one
of its own transactions, e.g. in cases where the user or
kernel file system wants to be sure metadata is
written/journaled for structural integrity, but cares less
about the data itself?

I guess this difference might show up when copying a large
file over the same target file repeatedly (slower) vs
copying it to a series of new files (faster).
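
If you wanted to measure that, here is a minimal sketch, assuming
Python 3 on the machine under test; the size, pass count, and file
names are placeholders I made up, and fsync() is called so the
timings reflect actual writes rather than page-cache hits:

--8<---------------cut here---------------start------------->8---
#!/usr/bin/env python3
# Hypothetical micro-benchmark: overwrite one target file repeatedly
# vs. write a series of fresh files, fsync'ing every pass.
import os
import time

SIZE = 64 * 1024 * 1024      # placeholder size; scale up for a real test
PASSES = 10
data = os.urandom(SIZE)

def timed(label, fn):
    t0 = time.monotonic()
    fn()
    print(f"{label}: {time.monotonic() - t0:.2f}s")

def overwrite_same_target():
    for _ in range(PASSES):
        with open("target.bin", "wb") as f:       # same target every pass
            f.write(data)
            f.flush()
            os.fsync(f.fileno())

def write_new_files():
    for i in range(PASSES):
        with open(f"target-{i}.bin", "wb") as f:  # fresh file every pass
            f.write(data)
            f.flush()
            os.fsync(f.fileno())

timed("overwrite same target", overwrite_same_target)
timed("series of new files", write_new_files)
--8<---------------cut here---------------end--------------->8---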

What happens if you use a contiguous file as swap space?
Or, if you use anonymous files as user-space data buffers,
passing them to Wayland as file handles per its protocol,
can you do better by leaving SSD controllers and/or storage
hardware out of the picture altogether?
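
For what it's worth, the anonymous-file idea looks roughly like this,
as a sketch assuming Linux and Python 3.8+ (for os.memfd_create); the
buffer name and size are illustrative, and the Wayland fd-passing
step is only hinted at in the comments:

--8<---------------cut here---------------start------------->8---
#!/usr/bin/env python3
# Hypothetical sketch: an anonymous, memory-backed file used as a data
# buffer; this is the mechanism Wayland clients typically use for
# wl_shm buffers (a memfd whose descriptor is passed over a socket).
import mmap
import os

SIZE = 4096                                   # illustrative buffer size
fd = os.memfd_create("demo-buffer", os.MFD_CLOEXEC)
os.ftruncate(fd, SIZE)

buf = mmap.mmap(fd, SIZE)                     # map it and write some data
buf[:5] = b"hello"

# The fd could now be handed to another process (e.g. a compositor)
# via sendmsg() with SCM_RIGHTS; the data lives in anonymous memory
# and never involves the SSD controller at all.
print(f"memfd {fd} holds {buf[:5]!r}")
--8<---------------cut here---------------end--------------->8---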

Reference [0] is from 2013, so much has probably happened
since then, but the paper mentions the following (a situation
that has probably not gotten better), referring to trade
secrets that give one manufacturer the ability to produce
longer-lasting SSDs more cheaply and better than others ...
--8<---------------cut here---------------start------------->8---
This means that the SSD controller is dedicated to a
single brand of NAND, and it means that the SSD maker
can’t shop around among NAND suppliers for the best price.
Furthermore, the NAND supplier won’t share this
information unless it believes that there is some compelling
reason to work with the SSD manufacturer. Since there are
hundreds of SSD makers it’s really difficult to get these
companies to pay attention to you! The SSD manufacturers
that have this kind of relationship with their flash
suppliers are very rare and very special.
--8<---------------cut here---------------end--------------->8---
Well, maybe you will have to parameterize your file system
tuning with manufacturer ID and SSD controller firmware
version ;/

Best regards, Bengt
[0] https://www.snia.org/sites/default/files/SSSITECHNOTES_HowControllersMaximizeSSDLife.pdf
On +2021-12-21 18:26:03 +0100, Ricardo Wurmus wrote:
> Today we discovered a few more things and discussed them on IRC. Here’s
> a summary.
>
> /var/cache sits on the same storage as /gnu. We mounted the 5TB ext4
> file system that’s hosted by the SAN at /mnt_test and started copying
> over /var/cache to /mnt_test/var/cache. Transfer speed was considerably
> faster (not *great*, but reasonably fast) than the copy of
> /gnu/store/trash to the same target.
>
> This confirmed our suspicions that the problem is not with the storage
> array but due to the fact that /gnu/store/trash (and also /gnu/store)
> is an extremely large, flat directory. /var/cache is not.
>
> Here’s what we do now: continue copying /var/cache to the SAN, then
> remount to serve substitutes from there. This removes some pressure
> from the file system as it will only be used for /gnu. We’re
> considering dumping the file system completely (i.e. reinstalling the
> server), thereby emptying /gnu, but leaving the stash of built
> substitutes in /var/cache (hosted from the faster SAN).
>
> We could take this opportunity to reformat /gnu with btrfs, which
> performs quite a bit more poorly than ext4 but would be immune to
> fragmentation. It’s not clear that fragmentation matters here. It
> could just be that the problem is exclusively caused by having these
> incredibly large, flat /gnu/store, /gnu/store/.links, and
> /gnu/store/trash directories.
>
> A possible alternative for this file system might also be XFS, which
> performs well when presented with unreasonably large directories.
>
> It may be a good idea to come up with realistic test scenarios that we
> could test with each of these three file systems at scale.
>
> Any ideas?
>
> --
> Ricardo
>
>
>
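
Re the "realistic test scenarios" question above, here is a minimal
sketch of one such scenario, assuming Python 3: fill a single flat
directory with long, hash-like names and time create/listdir/unlink
on each candidate file system (ext4, btrfs, XFS). N and the name
format are placeholders and would need scaling up towards /gnu/store
proportions:

--8<---------------cut here---------------start------------->8---
#!/usr/bin/env python3
# Hypothetical flat-directory benchmark: create N entries with long,
# hash-like names in one directory, then time listing and unlinking
# them.  Run the same script on ext4, btrfs and XFS mounts to compare.
import hashlib
import os
import sys
import time

N = 100_000                        # placeholder; scale up for realism
root = sys.argv[1] if len(sys.argv) > 1 else "./flatdir-bench"
os.makedirs(root, exist_ok=True)
names = []

def timed(label, fn):
    t0 = time.monotonic()
    fn()
    print(f"{label}: {time.monotonic() - t0:.1f}s")

def create_entries():
    for i in range(N):
        name = hashlib.sha256(str(i).encode()).hexdigest() + "-dummy"
        open(os.path.join(root, name), "w").close()

def list_entries():
    names.extend(os.listdir(root))

def unlink_entries():
    for name in names:
        os.unlink(os.path.join(root, name))

timed(f"create {N} entries", create_entries)
timed("listdir", list_entries)
timed("unlink all", unlink_entries)
--8<---------------cut here---------------end--------------->8---
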
(sorry, the top-post grew)
--
Regards,
Bengt Richter