unofficial mirror of bug-guix@gnu.org 
 help / color / mirror / code / Atom feed
From: ludo@gnu.org (Ludovic Courtès)
To: Mark H Weaver <mhw@netris.org>
Cc: 24937@debbugs.gnu.org
Subject: bug#24937: "deleting unused links" GC phase is too slow
Date: Tue, 13 Dec 2016 01:00:07 +0100	[thread overview]
Message-ID: <87d1gwvgu0.fsf@gnu.org> (raw)
In-Reply-To: <87lgvm9sgq.fsf@netris.org> (Mark H. Weaver's message of "Sun, 11 Dec 2016 14:27:33 -0500")

[-- Attachment #1: Type: text/plain, Size: 4071 bytes --]

Mark H Weaver <mhw@netris.org> skribis:

> ludo@gnu.org (Ludovic Courtès) writes:
>
>> Mark H Weaver <mhw@netris.org> skribis:
>>
>>> I think we should sort the entire directory using merge sort backed to
>>> disk files.  If we load chunks of the directory, sort them and process
>>> them individually, I expect that this will increase the amount of I/O
>>> required by a non-trivial factor.  In each pass, we would load blocks of
>>> inodes from disk, almost all of which are likely to be present in the
>>> store and thus linked from the directory, but in this scheme we will
>>> process only a small number of them and drop the rest on the floor to be
>>> read again in the next pass.  Given that even my fairly optimal
>>> implementation takes about 35 minutes to run on Hydra, I'd prefer to
>>> avoid multiplying that by a non-trivial factor.
>>
>> Sure, though it’s not obvious to me how much of a difference it makes;
>> my guess is that processing in large chunks is already a win, but we’d
>> have to measure.
>
> I agree, it would surely be a win.  Given that it currently takes on the
> order of a day to run this phase on Hydra, if your proposed method takes
> 2 hours, that would be a huge win, but still not good, IMO.  Even 35
> minutes is slower than I'd like.

Of course.

I did some measurements with the attached program on chapters, which is
a Xen VM with spinning disks underneath, similar to hydra.gnu.org.  It
has 600k entries in /gnu/store/.links.

Here’s a comparison of the “optimal” mode (bulk stats after we’ve
fetched all the dirents) vs. the “semi-interleaved” mode (doing bulk
stats every 100,000 dirents):

--8<---------------cut here---------------start------------->8---
ludo@guix:~$ gcc -std=gnu99 -Wall links-traversal.c  -DMODE=3
ludo@guix:~$ sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
ludo@guix:~$ time ./a.out
603858 dir_entries, 157 seconds
stat took 1 seconds

real    2m38.508s
user    0m0.324s
sys     0m1.824s
ludo@guix:~$ gcc -std=gnu99 -Wall links-traversal.c  -DMODE=2
ludo@guix:~$ sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'
ludo@guix:~$ time ./a.out 
3852 dir_entries, 172 seconds (including stat)

real    2m51.827s
user    0m0.312s
sys     0m1.808s
--8<---------------cut here---------------end--------------->8---

Semi-interleaved is ~12% slower here (not sure how reproducible that is
though).

>>> Why not just use GNU sort?  It already exists, and does exactly what we
>>> need.
>>
>> Does ‘sort’ manage to avoid reading whole files in memory?
>
> Yes, it does.  I monitored the 'sort' process when I first ran my
> optimized pipeline.  It created about 10 files in /tmp, approximately 70
> megabytes each as I recall, and then read them all concurrently while
> writing the sorted output.
>
> My guess is that it reads a manageable chunk of the input, sorts it in
> memory, and writes it to a temporary file.  I guess it repeats this
> process, writing multiple temporary files, until the entire input is
> consumed, and then reads all of those temporary files, merging them
> together into the output stream.

OK.  That seems to be that the comment above ‘sortlines’ in sort.c
describes.

>>> If you object to using an external program for some reason, I would
>>> prefer to re-implement a similar algorithm in the daemon.
>>
>> Yeah, I’d rather avoid serializing the list of file names/inode number
>> pairs just to invoke ‘sort’ on that.
>
> Sure, I agree that it would be better to avoid that, but IMO not at the
> cost of using O(N) memory instead of O(1) memory, nor at the cost of
> multiplying the amount of disk I/O by a non-trivial factor.

Understood.

sort.c in Coreutils is very big, and we surely don’t want to duplicate
all that.  Yet, I’d rather not shell out to ‘sort’.

Do you know how many entries are in .links on hydra.gnu.org?  If it
performs comparably to chapters, the timings suggests it should have
around 10.5M entries.

Thanks!

Ludo’.


[-- Attachment #2: the code --]
[-- Type: text/plain, Size: 2014 bytes --]

#include <unistd.h>
#include <dirent.h>
#include <sys/types.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/time.h>
#include <string.h>
#include <sys/stat.h>
#include <assert.h>

#define STAT_INTERLEAVED 1
#define STAT_SEMI_INTERLEAVED 2
#define STAT_OPTIMAL 3

struct entry
{
  char *name;
  ino_t inode;
};

#define MAX_ENTRIES 1000000
static struct entry dir_entries[MAX_ENTRIES];

int
main ()
{
  struct timeval start, end;

  /* For useful timings, do:
     sudo sh -c 'echo 3 > /proc/sys/vm/drop_caches'  */
  gettimeofday (&start, NULL);
  DIR *links = opendir ("/gnu/store/.links");

  size_t count = 0;

#if MODE != STAT_INTERLEAVED
  void sort_entries (void)
  {
    int entry_lower (const void *a, const void *b)
    {
      return ((struct entry *)a)->inode < ((struct entry *)b)->inode;
    }

    qsort (dir_entries, count, sizeof (struct entry),
	   entry_lower);
  }
#endif

  void stat_entries (void)
  {
    for (size_t i = 0; i < count; i++)
      {
	struct stat st;
	lstat (dir_entries[i].name, &st);
      }
  }

  for (struct dirent *entry = readdir (links);
       entry != NULL;
       entry = readdir (links))
    {
      assert (count < MAX_ENTRIES);
      dir_entries[count].name = strdup (entry->d_name);
      dir_entries[count].inode = entry->d_ino;
#if MODE == STAT_INTERLEAVED
      struct stat st;
      lstat (entry->d_name, &st);
#endif

#if MODE == STAT_SEMI_INTERLEAVED
      if (count++ >= 100000)
	{
	  sort_entries ();
	  stat_entries ();
	  count = 0;
	}
#else
      count++;
#endif
    }

#if MODE == STAT_SEMI_INTERLEAVED
  sort_entries ();
  stat_entries ();
#endif

  gettimeofday (&end, NULL);
  printf ("%zi dir_entries, %zi seconds"
#if MODE != STAT_OPTIMAL
	  " (including stat)"
#endif
	  "\n", count,
	  end.tv_sec - start.tv_sec);

#if MODE == STAT_OPTIMAL
  sort_entries ();
  gettimeofday (&start, NULL);
  stat_entries ();
  gettimeofday (&end, NULL);

  printf ("stat took %zi seconds\n", end.tv_sec - start.tv_sec);
#endif

  return EXIT_SUCCESS;
}

  reply	other threads:[~2016-12-13  0:01 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-11-13 17:41 bug#24937: "deleting unused links" GC phase is too slow Ludovic Courtès
2016-12-09 22:43 ` Ludovic Courtès
2016-12-11 13:46 ` Ludovic Courtès
2016-12-11 14:23   ` Mark H Weaver
2016-12-11 18:02     ` Ludovic Courtès
2016-12-11 19:27       ` Mark H Weaver
2016-12-13  0:00         ` Ludovic Courtès [this message]
2016-12-13 12:48           ` Mark H Weaver
2016-12-13 17:02             ` Ludovic Courtès
2016-12-13 17:18               ` Ricardo Wurmus
2020-04-16 13:26                 ` Ricardo Wurmus
2020-04-16 14:27                   ` Ricardo Wurmus
2020-04-17  8:16                     ` Ludovic Courtès
2020-04-17  8:28                       ` Ricardo Wurmus
2016-12-13  4:09         ` Mark H Weaver
2016-12-15  1:19           ` Mark H Weaver
2021-11-09 14:44 ` Ludovic Courtès
2021-11-09 15:00   ` Ludovic Courtès
2021-11-11 20:59   ` Maxim Cournoyer
2021-11-13 16:56     ` Ludovic Courtès
2021-11-13 21:37       ` bug#24937: [PATCH 1/2] tests: Factorize 'file=?' Ludovic Courtès
2021-11-13 21:37         ` bug#24937: [PATCH 2/2] daemon: Do not deduplicate files smaller than 4 KiB Ludovic Courtès
2021-11-16 13:54           ` bug#24937: "deleting unused links" GC phase is too slow Ludovic Courtès
2021-11-13 21:45       ` Ludovic Courtès
2021-11-22  2:30 ` John Kehayias via Bug reports for GNU Guix

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

  List information: https://guix.gnu.org/

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87d1gwvgu0.fsf@gnu.org \
    --to=ludo@gnu.org \
    --cc=24937@debbugs.gnu.org \
    --cc=mhw@netris.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
Code repositories for project(s) associated with this public inbox

	https://git.savannah.gnu.org/cgit/guix.git

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).