From mboxrd@z Thu Jan 1 00:00:00 1970
From: ludo@gnu.org (Ludovic Courtès)
Subject: bug#24937: "deleting unused links" GC phase is too slow
Date: Sun, 11 Dec 2016 19:02:42 +0100
Message-ID: <87twaa2vjx.fsf@gnu.org>
References: <87wpg7ffbm.fsf@gnu.org> <87lgvm4lzu.fsf@gnu.org> <87twaaa6j9.fsf@netris.org>
In-Reply-To: <87twaaa6j9.fsf@netris.org> (Mark H. Weaver's message of "Sun, 11 Dec 2016 09:23:38 -0500")
To: Mark H Weaver
Cc: 24937@debbugs.gnu.org

Mark H Weaver skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Here’s a proposed patch that follows your suggestion, Mark, but places
>> an upper bound on the number of directory entries loaded in memory.
>>
>> On my laptop, which has ~500k entries in /gnu/store/.links, the result
>> is something like this (notice the inode numbers in ‘lstat’ calls):
>>
>> 13738 write(4, "gmlo\0\0\0\0\31\0\0\0\0\0\0\0deleting unused "..., 48) = 48
>> 13738 open("/gnu/store/.links", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 8
>> 13738 fstat(8, {st_dev=makedev(8, 2), st_ino=4014083, st_mode=S_IFDIR|0755, st_nlink=2, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=119224, st_size=60977152, st_atime=2016/12/11-12:01:59, st_mtime=2016/12/11-09:39:45, st_ctime=2016/12/11-09:39:45}) = 0
>> 13738 getdents(8, {{d_ino=4014083, d_off=4294967296, d_reclen=24, d_name="."}
>> [...]
>> 13738 lstat("/gnu/store/.links/1f2f3g8waxwymp9sl2slcfyara164i8w1y2sz3h9j= s2fcviv2rnc", {st_dev=3Dmakedev(8, 2), st_ino=3D47, st_mode=3DS_IFREG|0444,= st_nlink=3D2, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D88, s= t_size=3D41482, st_atime=3D2015/03/10-11:29:06, st_mtime=3D1970/01/01-01:00= :01, st_ctime=3D2015/11/25-11:29:20}) =3D 0 >> 13738 lstat("/gnu/store/.links/1p25kpyfw354in4kykmgh5sy9h925hnil1jdzgxhz= 7n6abbws8px", {st_dev=3Dmakedev(8, 2), st_ino=3D65, st_mode=3DS_IFREG|0444,= st_nlink=3D9, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D8, st= _size=3D2313, st_atime=3D2016/12/08-21:02:26, st_mtime=3D1970/01/01-01:00:0= 1, st_ctime=3D2016/12/11-01:44:49}) =3D 0 >> 13738 lstat("/gnu/store/.links/163187g637br9ys5pmshb01wjav53bs1g1a83m7c2= alpdyx3yqz2", {st_dev=3Dmakedev(8, 2), st_ino=3D68, st_mode=3DS_IFREG|0444,= st_nlink=3D2, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D32, s= t_size=3D13561, st_atime=3D2015/03/10-11:29:06, st_mtime=3D1970/01/01-01:00= :01, st_ctime=3D2015/11/25-11:29:20}) =3D 0 >> [...] >> 13738 getdents(8, {{d_ino=3D4257114, d_off=3D1734093409898198492, >> [...] >> 13738 lstat("/gnu/store/.links/1m6g06i01ybbkhjjbirjnj7fckw1b772cwygkvbd6= v6zgkln7f7m", {st_dev=3Dmakedev(8, 2), st_ino=3D19, st_mode=3DS_IFREG|0444,= st_nlink=3D4, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D8, st= _size=3D2553, st_atime=3D2016/12/08-21:02:54, st_mtime=3D1970/01/01-01:00:0= 1, st_ctime=3D2016/12/07-00:05:19}) =3D 0 >> 13738 lstat("/gnu/store/.links/1ml8n3q55ikn8h60sn67jq1y7z7mvdp5kwr33pqri= d2r6kk1d4kb", {st_dev=3Dmakedev(8, 2), st_ino=3D30, st_mode=3DS_IFREG|0444,= st_nlink=3D2, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D8, st= _size=3D2090, st_atime=3D2015/03/10-11:29:06, st_mtime=3D1970/01/01-01:00:0= 1, st_ctime=3D2015/11/25-11:29:21}) =3D 0 >> 13738 lstat("/gnu/store/.links/1c8cvwlqyyqby3k13cwm40g26pwca5iiz5dcj43xr= gn9y91lfvc2", {st_dev=3Dmakedev(8, 2), st_ino=3D33, st_mode=3DS_IFREG|0444,= st_nlink=3D2, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D16, s= t_size=3D7958, st_atime=3D2015/11/04-18:55:32, st_mtime=3D1970/01/01-01:00:= 01, st_ctime=3D2016/01/05-11:35:49}) =3D 0 >> [...] >> 13738 getdents(8, {{d_ino=3D328672, >> [...] >> 13738 lstat("/gnu/store/.links/1l55l59dxb74himmkfzx5v63cv7287i6rjhdns1fd= lwajqd73lnz", {st_dev=3Dmakedev(8, 2), st_ino=3D21, st_mode=3DS_IFREG|0444,= st_nlink=3D31, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D8, s= t_size=3D615, st_atime=3D2016/12/08-21:02:30, st_mtime=3D1970/01/01-01:00:0= 1, st_ctime=3D2016/12/11-01:44:47}) =3D 0 >> 13738 lstat("/gnu/store/.links/1c7mm5amw743mb45f1zg4d4r3g549ch35wks9izkc= gkx0jirpxsg", {st_dev=3Dmakedev(8, 2), st_ino=3D48, st_mode=3DS_IFREG|0444,= st_nlink=3D2, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D360, = st_size=3D176750, st_atime=3D2015/03/10-11:29:06, st_mtime=3D1970/01/01-01:= 00:01, st_ctime=3D2015/11/25-11:29:20}) =3D 0 >> 13738 lstat("/gnu/store/.links/0z5s7b0yk8mfn1np6gk3cdbmpnjgxg1g0l8vfq1aa= 01zwp06d3f0", {st_dev=3Dmakedev(8, 2), st_ino=3D61, st_mode=3DS_IFREG|0444,= st_nlink=3D2, st_uid=3D0, st_gid=3D0, st_blksize=3D4096, st_blocks=3D8, st= _size=3D1440, st_atime=3D2016/11/03-20:37:50, st_mtime=3D1970/01/01-01:00:0= 1, st_ctime=3D2016/11/07-00:01:57}) =3D 0 >> >> I can=E2=80=99t tell exactly how this affects performance because my mac= hine has >> an SSD where this operation takes ~3 seconds on a cold cache. 
>> it has performance comparable to that of doing all the ‘readdir’ at once
>> followed by all the ‘lstat’.
>>
>> Mark, how does that sound?
>
> I think we should sort the entire directory using merge sort backed by
> disk files.  If we load chunks of the directory, sort them, and process
> them individually, I expect that this will increase the amount of I/O
> required by a non-trivial factor.  In each pass, we would load blocks of
> inodes from disk, almost all of which are likely to be present in the
> store and thus linked from the directory, but in this scheme we would
> process only a small number of them and drop the rest on the floor to be
> read again in the next pass.  Given that even my fairly optimal
> implementation takes about 35 minutes to run on Hydra, I'd prefer to
> avoid multiplying that by a non-trivial factor.

Sure, though it’s not obvious to me how much of a difference it makes;
my guess is that processing in large chunks is already a win, but we’d
have to measure.

> Why not just use GNU sort?  It already exists, and does exactly what we
> need.

Does ‘sort’ manage to avoid reading whole files into memory?  If not, I
think it wouldn’t buy us anything to use it.

> If you object to using an external program for some reason, I would
> prefer to re-implement a similar algorithm in the daemon.

Yeah, I’d rather avoid serializing the list of file name/inode number
pairs just to invoke ‘sort’ on that.

Also, what algorithm are you referring to?

Thanks for the quick feedback!

Ludo’.
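
P.S.  For concreteness, here is a rough, untested sketch of the chunked
approach described above, in the spirit of the daemon’s C++ code.  The
names (‘deleteUnusedLinksChunked’, ‘chunkSize’) and the exact deletion
criterion are made up for illustration and are not taken from the actual
patch:

/* Sketch: scan the .links directory in bounded chunks, sorting each
   chunk by inode number so that the subsequent lstat() calls walk the
   inode table mostly sequentially instead of in random (hash) order.  */

#include <algorithm>
#include <string>
#include <utility>
#include <vector>

#include <dirent.h>
#include <sys/stat.h>
#include <unistd.h>

void deleteUnusedLinksChunked(const std::string &linksDir)
{
    const size_t chunkSize = 100000;  /* upper bound on entries in memory */

    DIR *dir = opendir(linksDir.c_str());
    if (dir == nullptr) return;

    bool exhausted = false;
    while (!exhausted) {
        /* Read at most ‘chunkSize’ (inode, name) pairs.  */
        std::vector<std::pair<ino_t, std::string>> chunk;
        struct dirent *ent = nullptr;
        while (chunk.size() < chunkSize
               && (ent = readdir(dir)) != nullptr) {
            std::string name = ent->d_name;
            if (name == "." || name == "..") continue;
            chunk.emplace_back(ent->d_ino, name);
        }
        if (ent == nullptr) exhausted = true;

        /* Sort by inode number: this is what makes the st_ino values in
           the ‘lstat’ trace above come out (mostly) in ascending order.  */
        std::sort(chunk.begin(), chunk.end());

        for (const auto &item : chunk) {
            std::string path = linksDir + "/" + item.second;
            struct stat st;
            if (lstat(path.c_str(), &st) == -1) continue;
            /* A link count of 1 would mean the .links entry is the only
               remaining reference, i.e., nothing in the store uses it.  */
            if (st.st_nlink == 1)
                unlink(path.c_str());
        }
    }

    closedir(dir);
}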