From: Mark H Weaver
Subject: bug#24937: "deleting unused links" GC phase is too slow
Date: Sun, 11 Dec 2016 09:23:38 -0500
Message-ID: <87twaaa6j9.fsf@netris.org>
In-Reply-To: <87lgvm4lzu.fsf@gnu.org> (Ludovic Courtès's message of "Sun, 11 Dec 2016 14:46:13 +0100")
To: Ludovic Courtès
Cc: 24937@debbugs.gnu.org

ludo@gnu.org (Ludovic Courtès) writes:

> Here’s a proposed patch that follows your suggestion, Mark, but places
> an upper bound on the number of directory entries loaded in memory.
>
> On my laptop, which has ~500k entries in /gnu/store/.links, the result
> is something like this (notice the inode numbers in ‘lstat’ calls):
>
> 13738 write(4, "gmlo\0\0\0\0\31\0\0\0\0\0\0\0deleting unused "..., 48) = 48
> 13738 open("/gnu/store/.links", O_RDONLY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 8
> 13738 fstat(8, {st_dev=makedev(8, 2), st_ino=4014083, st_mode=S_IFDIR|0755, st_nlink=2, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=119224, st_size=60977152, st_atime=2016/12/11-12:01:59, st_mtime=2016/12/11-09:39:45, st_ctime=2016/12/11-09:39:45}) = 0
> 13738 getdents(8, {{d_ino=4014083, d_off=4294967296, d_reclen=24, d_name="."}
> [...]
> 13738 lstat("/gnu/store/.links/1f2f3g8waxwymp9sl2slcfyara164i8w1y2sz3h9js2fcviv2rnc", {st_dev=makedev(8, 2), st_ino=47, st_mode=S_IFREG|0444, st_nlink=2, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=88, st_size=41482, st_atime=2015/03/10-11:29:06, st_mtime=1970/01/01-01:00:01, st_ctime=2015/11/25-11:29:20}) = 0
> 13738 lstat("/gnu/store/.links/1p25kpyfw354in4kykmgh5sy9h925hnil1jdzgxhz7n6abbws8px", {st_dev=makedev(8, 2), st_ino=65, st_mode=S_IFREG|0444, st_nlink=9, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=2313, st_atime=2016/12/08-21:02:26, st_mtime=1970/01/01-01:00:01, st_ctime=2016/12/11-01:44:49}) = 0
> 13738 lstat("/gnu/store/.links/163187g637br9ys5pmshb01wjav53bs1g1a83m7c2alpdyx3yqz2", {st_dev=makedev(8, 2), st_ino=68, st_mode=S_IFREG|0444, st_nlink=2, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=32, st_size=13561, st_atime=2015/03/10-11:29:06, st_mtime=1970/01/01-01:00:01, st_ctime=2015/11/25-11:29:20}) = 0
> [...]
> 13738 getdents(8, {{d_ino=4257114, d_off=1734093409898198492,
> [...]
> 13738 lstat("/gnu/store/.links/1m6g06i01ybbkhjjbirjnj7fckw1b772cwygkvbd6v6zgkln7f7m", {st_dev=makedev(8, 2), st_ino=19, st_mode=S_IFREG|0444, st_nlink=4, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=2553, st_atime=2016/12/08-21:02:54, st_mtime=1970/01/01-01:00:01, st_ctime=2016/12/07-00:05:19}) = 0
> 13738 lstat("/gnu/store/.links/1ml8n3q55ikn8h60sn67jq1y7z7mvdp5kwr33pqrid2r6kk1d4kb", {st_dev=makedev(8, 2), st_ino=30, st_mode=S_IFREG|0444, st_nlink=2, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=2090, st_atime=2015/03/10-11:29:06, st_mtime=1970/01/01-01:00:01, st_ctime=2015/11/25-11:29:21}) = 0
> 13738 lstat("/gnu/store/.links/1c8cvwlqyyqby3k13cwm40g26pwca5iiz5dcj43xrgn9y91lfvc2", {st_dev=makedev(8, 2), st_ino=33, st_mode=S_IFREG|0444, st_nlink=2, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=16, st_size=7958, st_atime=2015/11/04-18:55:32, st_mtime=1970/01/01-01:00:01, st_ctime=2016/01/05-11:35:49}) = 0
> [...]
> 13738 getdents(8, {{d_ino=328672,
> [...]
> 13738 lstat("/gnu/store/.links/1l55l59dxb74himmkfzx5v63cv7287i6rjhdns1fdlwajqd73lnz", {st_dev=makedev(8, 2), st_ino=21, st_mode=S_IFREG|0444, st_nlink=31, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=615, st_atime=2016/12/08-21:02:30, st_mtime=1970/01/01-01:00:01, st_ctime=2016/12/11-01:44:47}) = 0
> 13738 lstat("/gnu/store/.links/1c7mm5amw743mb45f1zg4d4r3g549ch35wks9izkcgkx0jirpxsg", {st_dev=makedev(8, 2), st_ino=48, st_mode=S_IFREG|0444, st_nlink=2, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=360, st_size=176750, st_atime=2015/03/10-11:29:06, st_mtime=1970/01/01-01:00:01, st_ctime=2015/11/25-11:29:20}) = 0
> 13738 lstat("/gnu/store/.links/0z5s7b0yk8mfn1np6gk3cdbmpnjgxg1g0l8vfq1aa01zwp06d3f0", {st_dev=makedev(8, 2), st_ino=61, st_mode=S_IFREG|0444, st_nlink=2, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=1440, st_atime=2016/11/03-20:37:50, st_mtime=1970/01/01-01:00:01, st_ctime=2016/11/07-00:01:57}) = 0
>
> I can’t tell exactly how this affects performance because my machine has
> an SSD where this operation takes ~3 seconds on a cold cache.  I suspect
> it has performance comparable to that of doing all the ‘readdir’ at once
> followed by all the ‘lstat’.
>
> Mark, how does that sound?

I think we should sort the entire directory using a merge sort backed by
files on disk.  If we load chunks of the directory, sort them, and
process them individually, I expect that this will increase the amount
of I/O required by a non-trivial factor.

In each pass, we would load blocks of inodes from disk, almost all of
which are likely to be present in the store and thus linked from the
directory, but in this scheme we would process only a small number of
them and drop the rest on the floor, to be read again in the next pass.

Given that even my fairly optimal implementation takes about 35 minutes
to run on Hydra, I'd prefer to avoid multiplying that by a non-trivial
factor.

Why not just use GNU sort?  It already exists, and does exactly what we
need.  If you object to using an external program for some reason, I
would prefer to re-implement a similar algorithm in the daemon.

Thanks,
      Mark
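[Editorial sketch, for illustration only.  The following C++ program shows the shape of what Mark proposes: stream the "<inode> <name>" pairs of /gnu/store/.links through GNU sort(1), whose external merge sort spills runs to disk when the data exceeds memory, then lstat() every entry in increasing inode order so reads of the inode table stay sequential.  This is not the daemon's actual code; the scratch-file path is an assumption, and the st_nlink == 1 deletion test reflects the daemon's criterion that a link referenced only by the .links entry itself is unused.]

    /* Hypothetical sketch, not guix-daemon code: remove unused entries of
       /gnu/store/.links by sorting the whole directory on inode number
       with GNU sort(1), then visiting entries in that order.  */
    #include <cstdio>
    #include <cstring>
    #include <string>
    #include <dirent.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main()
    {
        const std::string linksDir = "/gnu/store/.links";
        const char *sortedFile = "/tmp/links-by-inode"; /* assumed scratch file */

        /* Pass 1: feed "<inode> <name>" pairs to GNU sort, which performs
           the disk-backed merge sort for us.  */
        std::string cmd = std::string("sort -n > ") + sortedFile;
        FILE *out = popen(cmd.c_str(), "w");
        if (!out) { perror("popen"); return 1; }

        DIR *dir = opendir(linksDir.c_str());
        if (!dir) { perror("opendir"); return 1; }
        struct dirent *d;
        while ((d = readdir(dir)))
            if (strcmp(d->d_name, ".") && strcmp(d->d_name, ".."))
                fprintf(out, "%llu %s\n",
                        (unsigned long long) d->d_ino, d->d_name);
        closedir(dir);
        if (pclose(out) != 0) { fprintf(stderr, "sort failed\n"); return 1; }

        /* Pass 2: lstat() each link in increasing inode order; a link whose
           only remaining reference is the .links entry itself
           (st_nlink == 1) is unused and can be deleted.  */
        FILE *in = fopen(sortedFile, "r");
        if (!in) { perror("fopen"); return 1; }
        unsigned long long ino;
        char name[256];
        while (fscanf(in, "%llu %255s", &ino, name) == 2) {
            std::string path = linksDir + "/" + name;
            struct stat st;
            if (lstat(path.c_str(), &st) == -1) continue;
            if (st.st_nlink == 1)
                unlink(path.c_str());
        }
        fclose(in);
        unlink(sortedFile);
        return 0;
    }

[Because sort(1) bounds its memory use by spilling sorted runs to $TMPDIR and merging them, every directory entry is read from disk once and stat'ed once, rather than being re-read on each pass as in the chunked scheme Mark argues against; a daemon-internal variant would replace the popen() call with the same external-merge algorithm implemented in C++.]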